[PATCH v4 2/8] drm/atomic: Add support for mouse hotspots

Thu Jul 6 16:23:46 UTC 2023

On 7/6/23 01:01, Pekka Paalanen wrote:
> On Wed, 5 Jul 2023 09:08:07 -0700
> Michael Banack <banackm at vmware.com> wrote:
>
>> On 7/4/23 01:08, Pekka Paalanen wrote:
>>> On Mon, 3 Jul 2023 14:06:56 -0700
>>> Michael Banack <banackm at vmware.com> wrote:
>>>   
>>>> Hi, I can speak to the virtual mouse/console half of this from the
>>>> VMware-side.
>>>>
>>>> I believe Zack's preparing a new set of comments here that can speak to
>>>> most of your concerns, but I'll answer some of the other questions directly.
>>>>
>>>> On 6/29/23 01:03, Pekka Paalanen wrote:
>>>>> Is it really required that the hotspot coordinates fall inside the
>>>>> cursor plane? Will the atomic commit be rejected otherwise?
>>>> Most console systems require the hotspot to be within the cursor image,
>>>> but in theory it's semantically meaningful to have it extend outside the
>>>> image.
>>>>
>>>> VMware's clients in particular will clamp the hotspot to the dimension
>>>> of the cursor image if we receive one that's out of bounds.
>>>>
>>>> So I would assume the right thing to do here would be to allow it and
>>>> let the clients figure out how to best handle it.
>>> Hi,
>>>
>>> if it is normal that clients clamp the hotspot to inside the cursor
>>> image, then I would come to the opposite conclusion: KMS UAPI needs to
>>> require the hotspot to be within the cursor image. Otherwise the
>>> results would be unpredictable, if clients still continue to clamp it
>>> anyway. I would assume that clients in use today are not prepared to
>>> handle hotspot outside the cursor image.
>>>
>>> It is also not a big deal to require that. I think it would be very rare
>>> to not have hotspot inside the cursor image, and even if it happened,
>>> the only consequence would be that the guest display server falls back
>>> to rendered cursor instead of a cursor plane. That may happen any time
>>> anyway, if an application sets e.g. a huge cursor that exceeds cursor
>>> plane size limits.
>> Hypervisors are normally more privileged than the kernel, so any
>> hypervisor/remoting client here really should be dealing with this case
>> rather than trusting the kernel to handle it for them.
> Sorry, handle what? Trust the guest kernel to do what?
>
> Personally I'm only interested in the KMS UAPI the guest kernel offers
> to guest userspace, and requiring hotspot to be inside the cursor image
> is fine. I don't think it needs even a strong justification, it's what
> most would likely expect, and expectations are good to record in spec.
>
> The UAPI contract is between (guest) kernel and (guest) userspace, and
> I expect the kernel to fully enforce that towards the userspace.
>
> I understand that hypervisors cannot trust guest kernels for security,
> but I also think that's a different matter.

You were saying that results would be unpredictable if the kernel 
allowed hotspots to be outside the dimensions of the cursor image. I'm 
not clear in what way you think that would cause unpredictable results, 
or what problems that would cause?

Essentially setting the hotspot properties means that the hypervisor 
console can choose to either draw the cursor where the plane is located, 
or use the cursor-plane + hotspot information to draw the cursor where 
the user's mouse is on the client.
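
To make the two options concrete, here is a minimal sketch of the
arithmetic. The struct and function names are made up for illustration,
and the signed 32-bit coordinates are an assumption; this is not code
from any real console or driver.

```c
#include <stdint.h>

/* Hypothetical 2D point type, used for illustration only. */
struct point {
    int32_t x;
    int32_t y;
};

/*
 * Where the guest thinks the pointer is: the cursor plane's top-left
 * corner plus the hotspot offset within the cursor image.
 */
static struct point guest_pointer(struct point plane_pos, struct point hotspot)
{
    struct point p = { plane_pos.x + hotspot.x, plane_pos.y + hotspot.y };
    return p;
}

/*
 * Where the console would place the cursor image's top-left corner so
 * that the hotspot pixel sits exactly under the client's mouse.
 */
static struct point client_draw_origin(struct point client_mouse,
                                       struct point hotspot)
{
    struct point p = { client_mouse.x - hotspot.x, client_mouse.y - hotspot.y };
    return p;
}
```

Without hotspot information only the plane position is known, so the
console can only draw the image where the guest put the plane.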

That works the same whether the hotspot is clamped or not.  We mostly
use clamping to avoid pathological cases (like a hotspot of MAX_UINT32),
and get away with it because real Guest applications that do this are
very rare.
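
As a rough sketch of the kind of clamping I mean (the types and names
here are hypothetical, and clamping to the last pixel rather than to
the width/height itself is an assumption made for the example):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types for illustration; not from any real console. */
struct cursor_image {
    uint32_t width;   /* cursor image width in pixels */
    uint32_t height;  /* cursor image height in pixels */
};

struct hotspot {
    uint32_t x;
    uint32_t y;
};

/* Clamp a guest-supplied hotspot so it lands on a pixel inside the image. */
static struct hotspot clamp_hotspot(struct cursor_image img, struct hotspot hs)
{
    assert(img.width > 0 && img.height > 0);

    if (hs.x >= img.width)
        hs.x = img.width - 1;
    if (hs.y >= img.height)
        hs.y = img.height - 1;
    return hs;
}
```

A hotspot that is already inside the image passes through unchanged, so
well-behaved guests never notice the clamp.
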
>>>>> The question of which input device corresponds to which cursor plane
>>>>> might be good to answer too. I presume the VM runner is configured to
>>>>> expose exactly one of each, so there can be only one association?
>>>> As far as I know, all of the VM consoles are written as though they are
>>>> taking the place of what would be the physical monitors and input
>>>> devices on a native machine.  So they assume that there is one user,
>>>> sitting in front of one console, and all monitors/input devices are
>>>> being used by that user.
>>> Ok, but having a single user does not mean that there cannot be
>>> multiple independent pointers, especially on Wayland. The same with
>>> keyboards.
>> True, and if the userspace is doing anything complicated here, the
>> hypervisor has to be responsible for ensuring that whatever it's doing
>> works with that, or else this system won't work.  I don't know that the
>> kernel is in a good position to police that.
> What do you mean by policing here?
>
> Isn't it the hypervisor that determines what virtual input devices will
> be available to the guest OS? Therefore, the hypervisor is in a
> position to expose exactly one pointer device and exactly one
> cursor plane to guest OS which means the guest OS cannot get the
> association wrong. If that's the general and expected hypervisor
> policy, then there is no need to design explicit device association in
> the guest kernel UAPI. If so, I'd like it to be mentioned in the kernel
> docs, too.

I'm not entirely sure how to fit what you're calling a "pointer" into my 
mental model of what the hypervisor is doing...

For a typical Linux Guest, we currently expose 3+ virtual mouse devices, 
and choose which one to send input to based on what their guest drivers 
are doing, and what kind of input we got from the client.  We expect the 
input from all of those to end up in the same user desktop session, 
which we expect to own all the virtual screens, and that the user only
gets one cursor image from that.

But we think of that as being a contract between the user desktop and 
the hypervisor, not the graphics/mouse drivers.  I might be out of touch 
with how Wayland/KMS thinks of this, but normally the user desktop is 
receiving the mouse events (in terms of either relative dx/dy, or 
absolute mouse device coordinates [0, MAX_UINT32] or something) and 
mapping those onto specific pixels in the user's desktop, which then 
gets passed up to the graphics driver as the location of the mouse cursor.
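
For the absolute-device case, the mapping the user desktop does is
roughly the following. This is only a sketch: the function name is made
up, and the [0, UINT32_MAX] range and round-to-nearest behaviour are
assumptions for the example, not what any particular desktop does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map an absolute device coordinate in [0, UINT32_MAX] onto a pixel
 * index in [0, extent - 1], rounding to the nearest pixel.
 * "extent" is the desktop width or height in pixels.
 */
static uint32_t abs_to_pixel(uint32_t abs, uint32_t extent)
{
    assert(extent > 0);

    /* Widen to 64 bits so the multiplication cannot overflow. */
    uint64_t scaled = (uint64_t)abs * (extent - 1) + UINT32_MAX / 2;
    return (uint32_t)(scaled / UINT32_MAX);
}
```

The desktop then subtracts the hotspot from that pixel to get the cursor
plane position it hands to the graphics driver.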

So I guess I'm not clear on what kind of usermode<=>kernel contract you 
want here if the kernel isn't what's owning the translation between the 
mouse input and the cursor position.  The hypervisor awkwardly has to 
straddle both the input/graphics domain, and we do so by making 
assumptions about how the user desktop is going to behave.

From VMware's perspective, I think it's fair to document that all input
devices are expected to feed into the same single cursor plane.  Or to 
generalize that slightly, if a virtual graphics driver chose to expose 
multiple cursor planes, then I think noting that it's the hypervisor's 
responsibility to ensure that it's synchronizing the correct cursor 
hotspot with the correct user pointer is probably also fair, but we 
would be extrapolating past what anyone is doing today (as far as I'm 
aware).

>
>>>   
>>>> Any more complicated multi-user/multi-cursor setup would have to be
>>>> coordinated through a lot of layers (ie from the VM's userspace/kernel
>>>> and then through hypervisor/client-consoles), and as far as I know
>>>> nobody has tried to plumb that all the way through.  Even physical
>>>> multi-user/multi-console configurations like that are rare.
>>> Right.
>>>
>>> So if there is a VM viewer client running on a Wayland system, that viewer
>>> client may be presented with an arbitrary number of independent
>>> pointer/keyboard/touchscreen input devices. Then it is up to the client
>>> to pick one at a time to pass through to the VM.
>>>
>>> That's fine.
>>>
>>> I just think it would be good to document, that VM/viewer systems
>>> expect to expose just a single pointer to the guest, hence it is
>>> obvious which input device in the guest is associated with all the
>>> cursor planes in the guest.
>> I don't have a problem adding something that suggests what we think the
>> hypervisors are doing, but I would be a little cautious trying to
>> prescribe what the hypervisors should be doing here.
> If the UAPI has been designed to cater for specific hypervisor
> configurations, then I think the assumptions should definitely be
> documented in the UAPI. Hypervisor developers can look at the UAPI and
> see what it caters for and what it doesn't. It's not a spec for what
> hypervisors must or must not do, but an explanation of what works and
> what doesn't given that guest userspace is forced to follow the UAPI.
>
> If there is no record of how the input vs. output device association is
> expected to be handled, I will be raising questions about it until it
> is.
>
> Having limitations is fine, but they need to be documented.

I think my confusion here is that if we were to try and support 
multi-user or multi-pointer sessions, our instinct would probably be to 
bypass the kernel entirely and work with a userspace<->hypervisor 
channel for communicating what the user desktops think the session 
topology is.

But as I noted above, I think it's fair to document that this is all 
assumed to be working in an environment where there is one cursor plane 
shared across all displays, and all input devices used by the hypervisor 
are processed as part of that session.  (If that's what you're looking 
for...)

>
>> I certainly can't speak for all of them, but we at least do a lot of odd
>> tricks to keep this coordinated that violate what would normally be
>> abstraction layers in a physical system such as having the mouse and the
>> display adapter collude.  Ultimately it's the hypervisor that is
>> responsible for doing the synchronization correctly, and the kernel
>> really isn't involved there besides ferrying the right information down.
> Are you happy with that, having to chase and special-case guest OS quirks?
>
> Or would you rather know how a guest Linux kernel expects and enforces
> guest userspace to behave, and develop for that, making all Linux OSs
> look fairly similar?
>
> You have a golden opportunity here to define how a Linux guest OS needs
> to behave. When it's enshrined in Linux UAPI, it will hold for decades,
> too.

I mean, we're certainly happy to make this as nice as possible for
ourselves and others, but when we're trying to support OSes from the
last 30+ years, we end up with a lot of quirks no matter what we do.

I mentioned earlier about the display<=>input mapping, but the model we
use internally is closer to what a desktop manager is doing than a
kernel.  So each virtual display is rooted at a point in the topology
that corresponds to the user desktop's idea of how the mouse moves
around the screens, and then we use that to map client mouse coordinates
into whatever space the input device is using so that the guest's
desktop sends the mouse to the correct location.
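
A simplified sketch of that mapping, assuming the topology's bounding
box starts at (0, 0) and the input device is an absolute one covering
[0, UINT32_MAX] on each axis. All names here are hypothetical and the
real logic has more cases than this:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one virtual display in the topology. */
struct display {
    uint32_t root_x;  /* position of the display's top-left corner */
    uint32_t root_y;  /* within the desktop topology, in pixels    */
    uint32_t width;
    uint32_t height;
};

struct abs_coord {
    uint32_t x;
    uint32_t y;
};

/*
 * Convert a client mouse position, given in pixels local to one virtual
 * display, into absolute device coordinates spanning the whole desktop
 * (desk_w x desk_h pixels).
 */
static struct abs_coord client_to_device(const struct display *d,
                                         uint32_t local_x, uint32_t local_y,
                                         uint32_t desk_w, uint32_t desk_h)
{
    uint64_t px = (uint64_t)d->root_x + local_x;
    uint64_t py = (uint64_t)d->root_y + local_y;
    struct abs_coord out;

    assert(desk_w > 1 && desk_h > 1);
    assert(local_x < d->width && local_y < d->height);
    assert(px < desk_w && py < desk_h);

    /* Scale desktop pixels onto the device's full absolute range. */
    out.x = (uint32_t)(px * UINT32_MAX / (desk_w - 1));
    out.y = (uint32_t)(py * UINT32_MAX / (desk_h - 1));
    return out;
}
```

This only works if the guest desktop's idea of the topology matches the
one the hypervisor used for the conversion.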

I'm not a KMS expert either, but I thought that the X11/Wayland 
component was still doing that display<=>mouse mapping and the kernel 
just matched up the display images with the monitors.

>
>>> Btw. what do you do if a guest display server simultaneously uses
>>> multiple cursor planes, assuming there are multiple outputs each with a
>>> cursor plane? Or does the VM/viewer system limit the number of outputs
>>> to one for the guest?
>> Zack would have to confirm what the vmwgfx driver does, but the VMware
>> virtual display hardware at least only has one cursor position.  So I
>> would assume that vmwgfx tries to only expose one plane and the rest get
>> emulated, or else it just picks one to set live, but I'm not an expert
>> on vmwgfx.
> Right. I would not expect a guest driver to invent more virtual devices
> than what the hypervisor exposes.
>
> I believe that using universal planes KMS UAPI, a guest display driver
> can also expose a single cursor plane that can migrate between CRTCs.
>
>> Normally we try to run a userspace agent in the Guest that also helps
>> coordinate screen positions/resolutions to match what the user wanted on
>> their client.  So when a user connects and requests from our UI that
>> they want the screens to be a particular configuration, we then send a
>> message to the userspace agent which coordinates with the display
>> manager to request that setup.  You can certainly manually configure
>> modes with things like rotation/topologies that break the console mouse,
>> but we try not to put the user into that state as much as possible.
>> Multiple cursors in the Guest display manager probably fall into that
>> category.
> That sounds like something that only works with Xorg as the guest
> display server, as X11 allows you to do that, and Wayland does not.
>
> You could do similar things through the guest kernel display driver by
> manufacturing hotplug events and changing read-only KMS properties
> accordingly, at least to some degree.
Yeah, what we have now is definitely X11-focused.  We've certainly 
thought about using hotplug events for controlling the display updates, 
and might move that direction someday.

>
> At some point, extending KMS for virtualized use cases stops being
> reasonable and it would be better to connect to the guest using VNC,
> RDP, or such. But I think adding hotspot properties is on the
> reasonable side and far from that line.

Possibly, yeah.  I mean, so far I don't think we're talking much about 
additional extensions (beyond the hotspot), but rather additional 
restrictions on what the desktop manager is doing.  But if more exotic 
usage of KMS becomes normal then that would be an interesting time to 
look at other options.

--Michael Banack