[Spice-devel] [RFC/POC PATCH 00/16] add output_id to monitors_config

Tue Jun 19 13:15:02 UTC 2018

On Tue, 2018-06-19 at 08:11 -0400, Frediano Ziglio wrote:
> > 
> > On Fri, 2018-06-15 at 15:12 +0200, Marc-André Lureau wrote:
> > > Hi
> > > 
> > > On Fri, Jun 15, 2018 at 1:01 PM, Lukáš Hrázký <lhrazky at redhat.com> wrote:
> > > > On Fri, 2018-06-15 at 12:24 +0200, Marc-André Lureau wrote:
> > > > > Hi
> > > > > 
> > > > > On Fri, Jun 15, 2018 at 12:16 PM, Lukáš Hrázký <lhrazky at redhat.com>
> > > > > wrote:
> > > > > > On Thu, 2018-06-14 at 21:12 +0200, Marc-André Lureau wrote:
> > > > > > > Hi
> > > > > > > 
> > > > > > > On Tue, Jun 5, 2018 at 5:30 PM, Lukáš Hrázký <lhrazky at redhat.com>
> > > > > > > wrote:
> > > > > > > > Hi everyone,
> > > > > > > > 
> > > > > > > > following is my attempt at solving the ID issues with
> > > > > > > > monitors_config
> > > > > > > > and streaming. The concept is as follows:
> > > > > > > > 
> > > > > > > 
> > > > > > > Before introducing a new solution, could you describe the "issues
> > > > > > > with
> > > > > > > monitors_config and streaming"? I am not following closely enough
> > > > > > > the
> > > > > > > mailing list probably, so a recap would be welcome. I'd like to
> > > > > > > understand the shortcomings of the current messages, and see if we
> > > > > > > are
> > > > > > > on the same page about it. Thanks!
> > > > > > 
> > > > > > Right, sorry about that!
> > > > > > 
> > > > > > The issue is when having a plugin a for the streaming agent that is
> > > > > > using a HW-accelerated encoder. The typical case is that you have a
> > > > > > QXL
> > > > > > monitor in the guest which shows the boot and then you have a
> > > > > > physical
> > > > > > HW device (either directly assigned to the VM or a vGPU) which is
> > > > > > configured in the X server.
> > > > > > 
> > > > > > Once the X server starts, the QXL monitor goes blank and you get a
> > > > > > second monitor on the client with the streamed content from the
> > > > > > streaming agent plugin.
> > > > > > 
> > > > > > The problem is the code and the protocol assumes (amongst other
> > > > > > assumptions) that all monitors are configured in X. The monitors on
> > > > > > the
> > > > > > client are put into an array and the index is used as the ID
> > > > > > throughout
> > > > > > SPICE. So for the case above, the QXL monitor is ID 0 and the
> > > > > > streamed
> > > > > > monitor is ID 1.
> > > > > 
> > > > > 
> > > > > Since the "streamed monitor" replaces the QXL monitor, wouldn't it
> > > > > make sense for them to share the same ID? This would be handled by the
> > > > > server/guest transparently.
> > > > 
> > > > Well... for one, it does not entirely replace it. From a user
> > > > experience PoV, yes, it makes sense to "replace" one monitor for the
> > > > other. We've even considered switching the content of the display
> > > > channel from QXL to the streamed one. But in the system, they are
> > > > different monitors on different channels, so it's a bad idea to give
> > > > them the same ID. And the ID we're talking about is actually (for the
> > > > monitors config messages):
> > > 
> > > So if it's the same from use PoV, the client API shouldn't need to change.
> > 
> > I didn't say it's the same from the user PoV, not entirely. We can
> > assume he want's to either be looking at the QXL monitor or the
> > streamed one, i.e.:
> > 
> > 1. VM is booting, we are showing QXL monitor
> > 2. X started, we switch to showing the streamed monitor
> 
> Note: this is also due to X limitation not supporting multiple devices,
> on Windows probably you'll have both used at the same time and probably
> user will manually disable QXL (as wants to use the other)
> 
> > 3. The user switches to a different TTY, we switch back to QXL
> > etc...
> > 
> > This is for the simple case of one QXL monitor and one streamed
> > monitor. But we are still "hiding" from the user that his VM actually
> > has two graphics devices, the QXL one and a physical (or virtual) GPU.
> > 
> > I once advocated the approach to switch the monitors for the users, but
> > now I think it would actually complicate things more and could
> > potentially cause problems.
> > 
> 
> I think the possibility to have the switch would be interesting and help
> but I think the actual protocol is limiting and potentially complicating
> stuff.
> 
> > > The client only cares about about having a specific set of monitors
> > > configuration. The way the server/guest handled them is not its
> > > business, it expects the best configuration.
> 
> True, the client should implement the protocol and server should use
> the protocol as best as it can. This does not however mean that the
> current protocol is perfect and using it as it is don't limit us
> and make the code a bunch of spaghetti code in order to make protocol
> happy.

Actually, I've realized this is an interesting point by Marc-André, the
client doesn't neccessarily need to care for the output_id I've added
to the protocol.

There was a suggestion by Uri to keep the output_id in a map on the
server and not send it to the client and back. That would mean no
protocol change. However, it is not possible because of the ID
discrepancy. There is no reliable key to use in the server on both the
sending and the receiving side.

It might be an idea to just fix the IDs in the protocol and still use
the map instead of sending the output_id through the client.

> > That is true, but the monitors configuration needs to reflect the
> > actual configuration of the guest. If we mix it up for the clients
> > convenience, problems... :)
> > 
> > > So far, we have the current simplified model (ie other more complex
> > > combinations are not supported):
> > > 
> > > qxl device, display channel 0,
> > >   monitor 0:
> > >   monitor 1:
> > >   monitor x..
> > > 
> > > 
> > > or
> > > 
> > > qxl device, display channel 0,
> > >   monitor 0:
> > > qxl device, display channel 1,
> > >   monitor 0:
> > > ..
> > > 
> > > Feel free to correct me, I haven't been looking into this for many years
> > > now.
> > 
> > Correct.
> > 
> 
> True. Note that this model however have physical (sockets and connections)
> repercussions which could be hard to maintain.
> If we decide to keep the current protocol and have one device we end up
> with a huge surface 0 with all monitors in it having to mux and unmux streamings
> and commands in both server and clients having to implement fair queueing and
> synchronization just to maintain an old design.
> Said that this approach would be prohibitive and looking at X implementation
> requiring a single surface this means that if streaming devices are used we must
> limit to one QXL with 1 monitor only and "map" all monitors to different channels
> (with one monitor each).

Tbh. I'm kind of lost here, I think this branch of the discussion is
mostly irrelevant, in case it's not, please let me know.

> > > > 
> > > > server -> client: a pair (channel_id, monitor_id)
> > > > 
> > > > client -> server: channel_id + monitor_id, converted to an array index
> > > > under the assumption I mentioned
> > > > 
> > > > So we can't really make it the same even if we wanted.
> > > 
> > > With multi-head/xrandr qxl:
> > > 
> > > qxl device, display channel 0,
> > >   monitor 0:
> > >   monitor 1:
> > >   monitor x..
> > > gpu device, stream channel 0,
> > >   maps to display channel 0, monitor 0
> > > gpu device, stream channel 1,
> > >   maps to display channel 0, monitor 1
> > > ...
> > > 
> > > With multihead/xrandr qxl & gpu:
> > > qxl device, display channel 0,
> > >   monitor 0:
> > >   monitor 1:
> > >   monitor x..
> > > gpu device, stream channel 0,
> > >   maps to display channel 0, monitor 0
> > >   maps to display channel 0, monitor 1
> > > ...
> > > 
> > > With multiple QXL devices:
> > > 
> > > qxl device, display channel 0,
> > >   monitor 0:
> > > qxl device, display channel 1,
> > >   monitor 0:
> > > gpu device, stream channel 0,
> > >   maps to display channel 0, monitor 0
> > > gpu device, stream channel 1,
> > >   maps to display channel 1, monitor 0
> > 
> > But, regardless of multihead/multiple devices, there is no guarantee
> > that the number of QXL monitors is going to match the number of GPU
> > device monitors. In fact, the most probable scenario is one QXL monitor
> > and multiple GPU device monitors.
> > 
> 
> This is not an issue, would be just a problem of allocating enough
> channels to support all monitors we need.
> 
> > I don't think it makes sense to have more than one QXL monitor anyway.
> > And if you did, it makes no sense to map them to GPU device monitors
> > besides the first one, I guess...
> > 
> > Also, it is unclear what you mean by the "mapping", i.e. how exactly
> > would you implement it?
> > 
> 
> Simply to the client you present a channel_id, monitor_id which can
> be different on the server.

The hair on my neck is rising when I'm just thinking of this :D

(Don't mess with the IDs!)

> > > If you want to support QXL devices that should not be associated with
> > > a gpu/stream channel, then you should be able to flag them.
> 
> Not clear what do you want with this. A QXL device should be either be
> seen by the client or not. You need to flag if seen or not.
> 
> > > 
> > > If you need a manual association of stream channel with qxl/monitor,
> > > this could be done with a manual configuration.
> 
> I don't see a case where I would need a manual configuration, I mean,
> should be configurable within the guest disabling/enabling devices
> and/or monitors.
> 
> > > 
> > > Imho, we should avoid making things more complex. Testing is hard
> > > enough. We should actually take the simplest approach possible (the
> > > same decision we took before, it's already a mess to deal with, as you
> > > noticed)
> 
> You are stating that the current situation is complex and also we need
> to keep things simple. Looks like a bit contradictory.
> Maybe the current situation is complex, or feel more complex than it
> should be, for the choices done and current implementation.
> From my point of view is just that client need to tell "I want this
> monitor this size" and "I'm moving the mouse here on this monitor".
> The problem is defining "this monitor" at protocol level from client
> to server, including messages to agent.
> 
> > 
> > I fully agree with the notion :) However, we are already making things
> > much more complex by adding the GPU device streaming feature. It breaks
> 
> I don't think the problem is related to streaming, more about devices
> which are not QXL. With only QXL all old assumptions are easy to maintain,
> the problem is the new devices. For instance having QXL 0 as Xrandr ID 0
> and QXL 1 as Xrandr ID 1 is pretty easy, is the same driver and actually
> we also manage (at code level!) the guest device driver. Just imagine if
> the driver changes the order or physical device scanning having QXL 0 as
> Xrandr 1 and QXL 1 as Xrandr 0! This would break the channel_id+monitor_id
> formula. This does not happen as we maintain it, but we (SPICE) don't
> maintain Intel/Nvidia/ATI drivers or Xorg/Wayland/Windows!

Yeah, this, though it doesn't seem likely, is something we should be
aware of. It doesn't break my current implementation though, because
(AFAICS it's going to be implemented in multi-monitor support for
stremaing) the streaming agent gets the IDs from xrandr and in the end
the IDs will again be used with xrandr, so as long as we keep them
updated, the IDs will match.

> I'm relative new in this group (3 years) compared to these stuff,
> some are possibly not in current git repos, but I think that more or
> less the history is:
> - before multimonitor. No much issues the monitor was the monitor;
> - start adding multi monitor support, only one device was used. The
>   problem was only sending the ID of the monitor which was 0, 1, ...
>   Added a monitor_config message in the SPICE protocol from server
>   to client. The client could resize the monitors sending a message
>   to the agent through the server, server was not involved. As there
>   was only a QXL device and no other devices there was no reason to send
>   information for channel/qxl device and monitor IDs matched Xrandr IDs;
> - for some reasons on Windows was not great, somebody decided to use
>   multiple QXL devices each with one monitor. As a workaround not having
>   the channel in the protocol to the agent the channel_id+monitor_id formula
>   was introduced to create an ID fitting into the existing one (for the
>   agent);
> - somebody realized that on Linux was possible to handle resolution
>   changes talking directly to the device making possible resize without
>   using the agent. This was implemented with a change in device modes
>   (rom/ram) and an interrupt to the card handled by our Linux kernel
>   driver. As on Linux there is only one device, spice-server just
>   "redirects" the message for the agent sent by the client to the QXL
>   card. This works on Windows as Windows driver does not handle this
>   interrupt/feature (or it would replicate monitor settings on all
>   cards!).
> Possibly some wrong assumptions, please correct me.
> Note: is not clear to me the history of the mouse events, currently
> spice-server sends messages to the agent for the mouse.
> 
> > previous assumptions and adds more stuff on top of it. And I think your
> > suggestion adds even more complexity than necessary. Perhaps not in the
> > client, but the "mapping" code, which sounds rather non-trivial, needs
> > to go somewhere, presumably the server?
> > 
> 
> I think too that bending to the old "design" will do stuff more
> complex.
> 
> > It is actually my intention to add as little complexity as possible
> > with my patches. The fact is, both the current code and the protocol is
> > not robust enough and need to be fixed. And even if they were robust
> 
> I would not speak about robustness and fix, from my point of view looks
> more a problem of flexibility. Simply we now have new use cases to
> support.

Yeah, that's mostly what I meant to say.

> > enough, the protocol would need to be extended either way. On top of
> > that, the backwards compatibility turns the solution into a very
> > complex one. But if we ever could deprecate and drop the old code
> > paths...
> > 
> 
> I have same sensation, not changing protocol will make thing much
> more complicated and possibly limited.
>
> > > > > > The more pressing issue with this is that a mouse motion event is
> > > > > > sent
> > > > > > with a display_id 1 from the client, but when it reaches the guest,
> > > > > > the
> > > > > > correct monitor is ID 0, as it is the only one.
> > > > > > 
> > > > > > The other thing this breaks is monitors_config messages that
> > > > > > enable/disable displays and change the resolution. Besides the same
> > > > > > "shift" happening here as well, another problem is the information
> > > > > > that
> > > > > > the monitors belong to different devices (or channels) is erased when
> > > > > > they're all put into the single array (under the assumption there is
> > > > > > either one channel with multiple monitors or multiple channels with
> > > > > > one
> > > > > > monitor each).
> > > > > 
> > > > > Correct and so far was a fine solution.
> > > > 
> > > > I respectfully disagree. Doing the channel_id + monitor_id I mentioned
> > > > above is asking for trouble, which is exactly what we have now.
> > > 
> > > With the limitation we decided, we avoid the biggest troubles though.
> 
> We avoid to change the protocol but to implement the new use cases
> we introduce potentially many assumptions and code complexity.
> Not saying that is hard to discuss new code thinking about design
> which nobody wrote down and that is bound to old code which nobody
> fully remember.
> 
> > 
> > I take it you were involved with the original implementation then...
> > I'm sorry for being so critical, it is not my intention to be mean :)
> > 
> > However, I am not criticising the limitations you chose to impose, but
> > the implementation. I find two problems here:
> > 
> > 1. Not keeping the IDs of the monitors_config messages as a
> > (channel_id, monitor_id) pair and instead abusing the limitation you
> > introduced to change it to a singe ID. I sort of consider it a hard
> > rule you always keep your unique IDs intact and always keep them with
> > the data. That way they're there when you need them and they work
> > (because you didn't mess with them :)).
> > 
> 
> I think this single ID came from the history I wrote above (that could
> be partly wrong).

Indeed, I haven't realized that possible sequence of events. In that
case it kind of proves the point that when you take shortcuts to make
things easier, you're just digging yourself a hole...

> I personally would add channel_id and monitor_id to the new messages,
> this to make possible to send the correct modes to the card in every
> possible case. Saving few bytes for these messages (which are not
> that frequent) is not worth it.

Yes, I've already decided to do that in the next iteration.

> > 2. The change in the IDs and the actual creation of the monitors_config
> > list leaks over the spice-gtk API to the client application (via the
> > spice_main_update_display{,_enabled}() function). It is unnecessary,
> > gives more opportunity to the client app to break it, and any extension
> > of the monitors_config message requires a spice-gtk API change.
> > 
> > You could have imposed the limitation you did (I'm not entirely sure
> > what was the reason) and still make the implementation more robust by
> > not doing the above...
> > 
> > And for these reasons this patch series is more complicated that it
> > could have been (though as I said, the protocol extension, i.e. adding
> > the output_id) would still be necessary.
> > 
> > I'm not saying this to pick on you, I'm not even sure how you were
> > involved (didn't go digging who wrote the code), but more as an
> > explanation and so that we all can learn from it for the future :)
> > 
> 
> Usually complex code is the result of many passages and possibly
> shortcuts, usually involving different people and undocumented
> stuff. Picking on people is often unfair.
>
> > > > > Either you do multiple monitor over a single channel, or you use
> > > > > multiple
> > > > > channels, each with one monitor.
> > > > 
> > > > That is a limitation you can introduce, perhaps there was a reason for
> > > > it. But the code is taking shortcuts and borderline hacks because of
> > > > this, while doing it right wouldn't be much harder. I'm not looking to
> > > > blame or judge anyone, I don't know how this came into place. But what
> > > > it is now is (IMO) a bad design and I feel the need to say it after
> > > > working on it for quite some time ;)
> > > 
> > > Doing it differently would lead to push the complexity higher up the
> > > stack, perhaps even to the user. We wanted to avoid that.
> 
> I don't see how this could reach the user.
> 
> > 
> > Yeah, as I said, not sure what the reason was, but IMO you could have
> > had that and still make it much more robust.
> 
> I cannot say a car has a bad design because it does not have the load
> of a big truck, just a different use case. We probably started using
> the car (to continue the metaphor) to load some stuff and we are now
> exceeding the limit. At the beginning we though a car was enough :-)
> 
> > 
> > Oh and BTW, not sure how old this is, but I'm sure I'd have much worse
> > things to say about the code _I_ wrote say like 8 years ago ;)
> > 
> > > > > > Hope this explains it well enough. It's all very complex and goes
> > > > > > over
> > > > > > multiple interfaces (the network protocols as well as the spice-gtk
> > > > > > API), so the more brilliant ideas you'll have, the more welcome they
> > > > > > will be :)
> > > > > 
> > > > > If we can avoid modifying all the layers and exposing the inner
> > > > > details, I would encourage the exploration of other solution.
> > > > 
> > > > I don't see how we can do that... And the inner details are already
> > > > quite exposed, unfortunately :(
> > > > 
> > > > As I said, ideas welcome.
> > > > 
> > > > > > > > 1. The streaming-agent sends a new StreamInfo message when it
> > > > > > > > starts
> > > > > > > > streaming. The message contains the output_id of the streamed
> > > > > > > > monitor.
> > > > > > > > The actual value is the index in the list of xrandr outputs.
> > > > > > > > Basically,
> > > > > > > > the number should uniquely identify a monitor in the guest
> > > > > > > > context under
> > > > > > > > X.
> > > > > > > > 
> > > > > > > > 2. The server copies the number into the
> > > > > > > > SpiceMsgDisplayMonitorsConfig
> > > > > > > > message and sends it to the client as part of the monitors_config
> > > > > > > > exchange.
> > > > > > > > 
> > > > > > > > 3. The client stores the output_id in it's list of
> > > > > > > > monitor_configs. Here
> > > > > > > > I had to make significant changes, because the monitors_config
> > > > > > > > code in
> > > > > > > > spice-gtk is very loosely written and also exposes quite a few
> > > > > > > > unnecessary details to the client app. The client sends the
> > > > > > > > output_id
> > > > > > > > back to the server as part of the monitors_config messages
> > > > > > > > (VDAgentMonitorsConfigV2) and also uses it for the mouse motion
> > > > > > > > event,
> > > > > > > > which also contains a display ID interpreted by the vd_agent. In
> > > > > > > > the
> > > > > > > > end, the API/ABI towards the client app should remain unchanged.
> > > > > > > > 
> > > > > > > > 4. The server passes the output_id in above-mentioned messages to
> > > > > > > > the
> > > > > > > > vd_agent. The output_id is meaningless in the server context.
> > > > > > > > (Currently, it doesn't pass the monitors_config messages if there
> > > > > > > > is a
> > > > > > > > QXL device that supports it, though. Needs more work.)
> > > > > > > > 
> > > > > > > > 5. vd_agent:
> > > > > > > >   a) For mouse input, the output_id was passed in the original
> > > > > > > >   message,
> > > > > > > >   so no change needed here, it works.
> > > > > > > > 
> > > > > > > >   b) If the server sends monitors_config to the guest, the
> > > > > > > >   vdagent will
> > > > > > > >   prefer to use monitors_configs with the output_ids set, if such
> > > > > > > >   are
> > > > > > > >   present. In that case, it ignores monitors_configs with the
> > > > > > > >   output_id
> > > > > > > >   unset. If no output_ids are present, it should behave as it
> > > > > > > >   used to.
> > > > > > > > 
> > > > > > > > A couple of things to note:
> > > > > > > > - While I did copy the VDAgentMonitorsConfig to a different
> > > > > > > > message for
> > > > > > > > backwards compatibility, I didn't do the same for
> > > > > > > > SpiceMsgDisplayMonitorsConfig, it remains to be done.
> > > > > > > > 
> > > > > > > > - I didn't introduce any capabilities to handle the
> > > > > > > > compatibility, also
> > > > > > > > remains to be done. Hopefully it is also clear it will be quite a
> > > > > > > > non-trivial job to do that :( Ain't gonna make the code prettier
> > > > > > > > either.
> > > > > > > > 
> > > > > > > > For your convenience, you can also pull the branches below:
> > > > > > > > https://gitlab.freedesktop.org/lukash/spice-protocol/tree/monitors-config-poc
> > > > > > > > https://gitlab.freedesktop.org/lukash/spice-common/tree/monitors-config-poc
> > > > > > > > https://gitlab.com/lhrazky/spice-streaming-agent/tree/monitors-config-poc
> > > > > > > > https://gitlab.freedesktop.org/lukash/spice/tree/monitors-config-poc
> > > > > > > > https://gitlab.freedesktop.org/lukash/spice-gtk/tree/monitors-config-poc
> > > > > > > > https://gitlab.freedesktop.org/lukash/vd_agent/tree/monitors-config-poc
> > > > > > > > 
> > > > > > > > All in all, it's big, complex and invasive. Note I did review the
> > > > > > > > emergency instructional video [1] and am therefore ready for any
> > > > > > > > bombardment you'll be sending my way :D (Don't hesitate to
> > > > > > > > contact me
> > > > > > > > with questions either)
> > > > > > > > 
> > > > > > > > Last minute note: I've noticed some of the patches are missing
> > > > > > > > Signed-off-by line, since they are not for merging, should not be
> > > > > > > > an
> > > > > > > > issue...
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Lukáš Hrázký (16):
> > > > > > > > spice-protocol
> > > > > > > >   Add the StreamInfo message
> > > > > > > >   Create a version 2 of the VDAgentMonitorsConfig message
> > > > > > > > spice-common
> > > > > > > >   add output_id to SpiceMsgDisplayMonitorsConfig
> > > > > > > > spice-streaming-agent
> > > > > > > >   Send a StreamInfo message when starting to stream
> > > > > > > > spice-server
> > > > > > > >   Handle the StreamInfo message from the streaming agent
> > > > > > > >   Use VDAgentMonitorsConfigV2 that contains the output_id field
> > > > > > > > spice-gtk
> > > > > > > >   Rework the handling of monitors_config
> > > > > > > >   Remove the n_display_channels check when sending
> > > > > > > >   monitors_config
> > > > > > > >   Use an 'enabled' flag instead of status enum in monitors_config
> > > > > > > >   Use VDAgentMonitorsConfigV2 as the message for monitors_config
> > > > > > > >   Add output_id to the monitors_config
> > > > > > > >   Use the new output_id as display ID for the mouse motion event
> > > > > > > > vd_agent
> > > > > > > >   vdagent: Log the diddly doo X11 error
> > > > > > > >   Improve/add some logging messages
> > > > > > > >   Use VDAgentMonitorsConfigV2 instead of VDAgentMonitorsConfig
> > > > > > > >   vdagent: Use output_id from VDAgentMonitorsConfigV2
> > > > > > > > 
> > > > > > > > [1] https://www.youtube.com/watch?v=IKqXu-5jw60
> > > > > > > > 
> 
> Frediano