multitouch

Thu Feb 25 17:18:30 PST 2010

On Thu, Feb 18, 2010 at 12:55:09PM +0100, Bradley T. Hughes wrote:
> >sure, I'm probably repeating myself with some of the points below, but it's
> >easier for me to get a train of thought going.
> >
> >The basic assumption for multitouch screens is that it will give us multiple
> >touchpoints simultaneously from a single physical device. At this point, we
> >have virtually no user-specific (or body-part specific) information attached
> >to these points. Furthermore, in X we do not know any context-specific
> >information. Any touchpoint other than the first may
> >- belong to a different user
> >- belong to the same user but a different bodypart that qualifies as "new
> >   input device" (think left hand + right hand working independently)
> >- belong to the same user but the same bodypart and be auxiliary (think
> >   thumb+index during pinching)
> 
> This basically hits the nail right on the head. How do we know the
> context of the touch points in the absence of essential information?

We can't. not within the X server. hence the need to find a solution that is
generic enough that it can forward data to context-aware clients but
specific enough that you can have more than one such client running at any
time.

> >In addition to that, any point may be part of a gesture but without the
> >context (i.e. without being the gesture recognizer) it's hard to tell if a
> >point is part of a gesture at all. Worse, depending on the context, the same
> >coordinates may be part of different gestures.
> >Given two touchpoints that start close to each other and move in diagonally
> >opposite directions, this gesture may be a user trying to zoom, a user
> >trying to pick two items apart or two users fighting for one object. without
> >knowing what's underneath, it's hard to say.
> 
> But this kind of operation is really application dependent, isn't
> it? I mean, the application would have to decide what the user is
> trying to do based on the starting/current/final location of the
> touch points...

correct. and that is one of the reasons why I want any context-specific
information processing (i.e. gestures) in the client. the server cannot have
enough information to even get started.

> >The current idea, not yet completely discarded, is to send touchpoints to the
> >client underneath the pointer, with the first touchpoint doing mouse
> >emulation. a touchpoint that started in a client is automatically grabbed
> >and sent to the client until the release, even if the touch leaves the client.
> >thus a gesture moving out of the client doesn't actually go out of the
> >client (behaviour similar to implicit passive grabs).  While such a grab is
> >active, any more touchpoints in this client go through the same channel,
> >while touchpoints outside that client go to the respective client
> >underneath.
> >
> >problem 1: you can't really do multi-mouse emulation since you need a master
> >device for that. so you'd have to create master devices on-the-fly for the
> >touchpoints in other clients and destroy them again. possible, but costly.
> 
> Why not only do mouse emulation for the first client that got the
> first touch point? It does eliminate implicit support for multiple
> user interaction for applications that don't have explicit support
> for multi-touch though. But as a starting point it may work. And
> then see if it's possible to do multiple mouse emulation?

imo, one of the really tempting features of multitouch support in X is to be
able to use it (within reason) with the default desktop. Yes, we could
rewrite the whole desktop to support touch but I'd just like to be able to
do more than that.

> >problem 2: gestures starting outside the client may go to the wrong one. not
> >sure how much that is a problem, I think that's more a side effect of a UI
> >not designed for touch.
> 
> I don't think it'll be much of a problem. I doubt we'll see users
> trying to use 2-finger scroll/pinch to manipulate the volume slider
> in the system tray, for example. We made this assumption in Qt...
> applications/components that want multi-touch will be designed in a
> way (e.g. large enough) that allows the user to hit the touch
> target.

Yes, I agree here - this is not a showstopper and is something users will
learn faster than we could write the workaround for anyway :)

> >problem 3: this requires the same device to be grabbed multiple times by
> >different clients, but possible not for mouse emulation. And a client
> >doesn't necessarily own a window and events may be sent to multiple clients at
> >the same time, all of which would then need such a grab. I think this is
> >where this approach breaks down, you'd get multiple clients getting the same
> >event and I'm not sure how that'd work out.
> 
> I don't quite understand what you meant by "but possible not for
> mouse emulation"?

s/possible/possibly/
what I meant here was that if a device sends two touch points A and B to two
separate clients, these clients would each have a touch grab on this device
(for the respective touchpoint). If A emulates a mouse event and the client
in return grabs the device, that means that B cannot emulate a mouse event
anymore since the device may be in a sync grab. also, a mouse event through
B may activate a passive grab, which makes freezing and thawing devices
quite interesting.
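
To make the A/B case a bit more concrete, here is a rough toy model of
per-touchpoint grabs plus a device-wide grab. This is not X server code; the
class, its methods and the window lookup are all made up for illustration,
and the sync grab is reduced to "only the grabbing client gets emulated
pointer events".

```python
# Toy model only: every name here is hypothetical, none of this is X server API.
class TouchRouter:
    def __init__(self, window_at):
        # window_at(x, y) -> client name; stands in for the real window lookup
        self.window_at = window_at
        self.owner = {}          # touch id -> client holding the per-touch grab
        self.device_grab = None  # client that grabbed the whole device, if any

    def begin(self, touch_id, x, y):
        # A new touch is bound to the client underneath it.
        client = self.window_at(x, y)
        self.owner[touch_id] = client
        return client

    def update(self, touch_id, x, y):
        # Position is ignored: the touch stays with the client it started in.
        return self.owner[touch_id]

    def end(self, touch_id):
        # Releasing the touch drops the per-touch grab.
        return self.owner.pop(touch_id)

    def grab_device(self, client):
        # Simplification of a client grabbing the device after a mouse event.
        self.device_grab = client

    def emulate_pointer(self, touch_id):
        # Returns the client receiving the emulated mouse event,
        # or None if another client's device grab blocks it.
        client = self.owner[touch_id]
        if self.device_grab not in (None, client):
            return None
        return client


router = TouchRouter(lambda x, y: "left" if x < 100 else "right")
router.begin("A", 10, 10)    # touch A starts in client "left"
router.begin("B", 150, 10)   # touch B starts in client "right"
router.update("A", 180, 10)  # A has moved over "right" but still goes to "left"
router.grab_device("left")   # "left" grabs the device after A's mouse event
router.emulate_pointer("B")  # None: B can no longer emulate a mouse event
```

The last call is the case above: once A's client holds the device, B has a
touch grab but no way to produce pointer events of its own.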

> I do understand the problem (I think), but unfortunately I'm not
> familiar with the internals to know what this would mean in
> practice. Considering that each touch-point would be given
> implicit-grabs, not the device, perhaps this is an indication that
> it makes sense (like Carsten mentioned) to have each touch-point a
> separate device in XI2?
> 
> But then I wonder... what happens when one client tries to grab the
> entire multi-touch device? hm...

Adding additional slave devices is rather costly - too costly to do it on
the fly imo. You could have a static number of slave devices for touchpoints
that are enabled/disabled as the touchpoints become available but then you
still have to deal with attachments, etc. and the cost itself doesn't really
drop by much.
There is definitely an argument for having the touch device represented as
one device (per user maybe?) in the server and having the touch points
forwarded on as sub-devices or something similar.
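
As a sketch of the "static number of slave devices" variant, assuming a fixed
pool created up front and a made-up SlavePool class (nothing like this exists
in the server):

```python
# Toy model only: SlavePool and its methods are hypothetical.
class SlavePool:
    def __init__(self, size):
        # One slot per pre-created slave device; None means disabled.
        self.slots = [None] * size

    def enable(self, touch_id):
        # Bind a new touchpoint to the first disabled slave device.
        for index, current in enumerate(self.slots):
            if current is None:
                self.slots[index] = touch_id
                return index
        return None  # pool exhausted, the touchpoint has no device

    def disable(self, touch_id):
        # Free the slave device again when the touchpoint is released.
        index = self.slots.index(touch_id)
        self.slots[index] = None
        return index


pool = SlavePool(2)
pool.enable("t1")   # uses slave 0
pool.enable("t2")   # uses slave 1
pool.enable("t3")   # None: no slave left for a third touchpoint
pool.disable("t1")  # slave 0 is free again
pool.enable("t3")   # reuses slave 0
```

This avoids creating devices on the fly, but it says nothing about
attachments or grabs, which is where the remaining cost sits.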

Cheers,
  Peter