Another approach to multitouch handling

Wed Jun 9 23:19:29 PDT 2010

On Mon, Jun 07, 2010 at 01:20:39AM -0400, Rafi Rubin wrote:
> On 06/07/2010 12:26 AM, Peter Hutterer wrote:
> >On Wed, Jun 02, 2010 at 04:40:34PM +0200, Carlos Garnacho wrote:
> >>I've been discussing with Peter Hutterer about the convenience of the
> >>"touchpoints as multiple valuators" approach, and how it could (IMHO)
> >>delay adoption in the short/mid term for anything related to multitouch.
> >
> >[...]
> >
> >>=The proposal=
> >>
> >>         The multitouch capable hw device would have a main device
> >>         created, which is able to send core events and be attached to a
> >>         MD, the evdev driver would also create several floating devices
> >>         (one for each touchpoint), unable to send core events nor to be
> >>         attached to a MD (I've disabled XI86_POINTER_CAPABLE for these,
> >>         but the server doesn't seem to honor that).
> >>
> >>         The only purpose for the main device would be routing events for
> >>         one of the floating touchpoints. Whenever a new touch happens,
> >>         and the main device isn't already routing events from another
> >>         touch, the events that such touchpoint generates would be sent
> >>         through the main device instead.
> >>
> >>         This means that there would be N+1 devices for N touchpoints, so
> >>         at least 1 of these devices wouldn't be sending events, this
> >>         makes touchpoints somewhat anonymous for multitouch purposes,
> >>         but the routed touchpoint would remain constant as long as it's
> >>         operating on the device (press ->  ... ->  release). This also
> >>         provides sane backwards compatibility, non-XI2 clients would
> >>         just see core events from the main device.
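
[Illustration, not part of the original mail] The routing rule described above can be modelled in a few lines. This is only a sketch under assumptions: the class name `TouchRouter`, the event kinds ("press", "motion", "release") and the device labels are hypothetical and do not come from the evdev patch. It shows that the first new touch is forwarded through the main device until its release, while every other touch stays on its own floating device.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TouchRouter:
    """Hypothetical model of N floating touchpoint devices plus one main device."""

    # Touch id currently forwarded through the main device, if any.
    routed_touch: Optional[int] = None

    def handle(self, touch_id: int, kind: str) -> str:
        """Return the (made-up) label of the device that emits this event."""
        # Only a *new* touch can claim the main device, and only if it is free.
        if kind == "press" and self.routed_touch is None:
            self.routed_touch = touch_id

        if touch_id == self.routed_touch:
            # The routed touch stays on the main device from press to release.
            if kind == "release":
                self.routed_touch = None
            return "main"

        # Every other touch is reported by its own floating device.
        return f"floating-{touch_id}"
```

In this model a touch that is already down is never promoted to the main device when the routed touch lifts; only the next press can claim it. That is one reading of "whenever a new touch happens", not something the mail states explicitly.
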
> >>
> >>         I've been experimenting with this concept, and together with a
> >>         ~200LOC patch to GTK+ master (master is already XI2 capable)
> >>         I've got things working out of the box, also wrt hotplugging.
> >>
> >>=The code=
> >>
> >>         http://cgit.freedesktop.org/~carlosg/xf86-input-evdev/log/?h=multitouch-subdevs
> >>
> >>         I've started off Benjamin's multitouch-subdevs branch for this
> >>         proof of concept.
> >>
> >>Ideas? comments?
> >
> >some more background here for others because some of the talks were on
> >private email:
> >
> >I've continuously failed to get a multitouch proposal together where touch
> >points act like pointers to clients. It always runs into the same walls with
> >the biggest one being that the transient nature of a touchpoint is quite
> >incompatible with many of the core protocol's assumptions.
> >
> >For example, the only way the X server can legally break a grab is by
> >unmapping the window. That, combined with the race conditions exposed by a
> >client belatedly grabbing a pointer that's not even there anymore, makes it
> >rather hard.
> >
> >One of the reasons the current approach with stuffing MT data into valuators
> >was picked was because it is implementable right now and at least had some
> >positive reception.
> >
> >Recently, I changed my requirements and figured that we may not need to have
> >MT core event support in the protocol but rather leave this up to the
> >toolkits. So instead of having core events from MT devices we send MT events
> >down the wire and the MT-aware toolkit converts those into the required
> >callbacks.
> >
> >When I asked Carlos about this, he had already started the work above which
> >overlaps to a large degree (though his implementation is different for
> >technical reasons).
> >
> >The main concept that I think we might need eventually here is twofold:
> >- A new device type (let's call it "Direct Input Device", DID) that does not
> >   require the abstraction between physical and virtual input device that we
> >   have with the MD/SD hierarchy. Unlike with a mouse, where you interact on
> >   the physical device is where you want the interaction to happen.
> >
> >- DIDs are _not_ core devices and thus only send
> >   XI2 events. This allows them to be transient with the protocol crafted
> >   around their requirements.
> >
> >The first DID could act like an MD and thus send core events, leaving
> >rudimentary single-touch capabilities. Because core falls away, we can
> >sidestep the grab handling on the whole lot.
> >What's not sorted out yet is sane keyboard handling. It most likely requires
> >the introduction of touch groups that share input focus between multiple
> >DIDs, and of course keyboards would then need to be attached to DIDs instead
> >of SDs, making it interesting for XI 2.0 clients.
> >
> >Carlos' implementation gets us similar effects already by using slave
> >devices instead of a new type of devices but especially when we think about
> >new event types I think having DIDs might be the long-term solution.
> >
> >So yeah, I'm rather optimistic about this approach though there are some
> >issues yet to be solved. If you want to chime in, please do so.
> >(Also, no code exists yet, this is just hot air from my side so far)
> >
> >Cheers,
> >   Peter
> 
> So are you saying you actually want to be able to subscribe to
> events from an mt finger in a window that's next to the window with the
> rest of the fingers?  Is that really a good idea?

I think so. Prime example is the two index fingers to move two different
objects. The user is unlikely to think of it as clustering, even though the
proximity may be the same as two fingers of the same hand.

The clustering you describe below achieves the same thing, except that I
think any touchpoint should be able to be the base of such a cluster (or touch
group, as I called it; AFAICT, these two terms mean the same here).

> Perhaps I should clarify my current understanding and thoughts.
> 
> I thought we were talking about having a pointer with a single
> conventional group/cluster position.  In this model, fingers show up
> as mt positions annotated on the pointer, and the cluster position
> may or may not be the position of one of those fingers.  I see that
> cluster position both as focus control for the mt contact set as
> well as a way to use mt as a conventional pointer.
> 
> I see the selection of cluster position as a bit of an arbitrary
> implementation detail (or even a normal option to accommodate
> differing preferences).  Four ideas come to mind:
> 1.  eldest active contact
> 2.  first contact (cluster doesn't move if that first finger stays off the sensor)
> 3.  geometric center
> 4.  completely independent.
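
[Illustration, not part of the original mail] The four options listed above could be compared with a small sketch like the following. Everything here is an assumption made for illustration: the `Contact` record, the policy names and the idea that lifted contacts remain in the list with their last position are all hypothetical, and "first contact" is read as "the cluster stays at the first finger's last position once it lifts".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Contact:
    contact_id: int
    pos: Point          # last known position
    down_time: float    # when the finger touched down
    active: bool = True # False once the finger has lifted


def cluster_position(contacts: List[Contact], policy: str,
                     independent_pos: Optional[Point] = None) -> Optional[Point]:
    """Pick a cluster position under one of four hypothetical policies."""
    if policy == "independent":
        # 4. Cluster position is driven by something other than the fingers.
        return independent_pos
    if policy == "first":
        # 2. Follow the first contact ever seen, even after it has lifted.
        first = min(contacts, key=lambda c: c.down_time, default=None)
        return first.pos if first else None

    active = [c for c in contacts if c.active]
    if not active:
        return None
    if policy == "eldest":
        # 1. Oldest contact that is still on the sensor.
        return min(active, key=lambda c: c.down_time).pos
    if policy == "center":
        # 3. Geometric center of all active contacts.
        n = len(active)
        return (sum(c.pos[0] for c in active) / n,
                sum(c.pos[1] for c in active) / n)
    raise ValueError(f"unknown policy: {policy}")
```

With one lifted finger and two active ones, "eldest" and "first" give different answers, which is the practical difference between options 1 and 2.
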
> 
> Consider the magic mouse for a moment.  It has a real conventional
> pointing device and an independent mt surface.  I think that clients
> that want to subscribe to finger positions on that touch surface
> should see them as somehow related to the position of the
> conventional pointer.

why? shouldn't touch and pointer on this device be independent, only locked
together if the client wants it? I don't think the server should enforce the
pairing here.

> I think we should eventually talk about supporting mt from one or
> more sensors in multiple windows/locations.  But I would like to
> think in terms of spatial clustering.  For example, we can cluster
> based on the size of a hand.  Each hand gets a core pointer with one
> or more fingers annotated.
> 
> 
> As for sub-sub-devices vs. valuators, that doesn't matter all that
> much to me. And I think if you establish the meaning, it shouldn't
> really matter to all that many people.  If you have a clean concept
> down, then it won't change the client side code all that much if you
> switch from one to the other.

Clustering or not matters once you start tracking touchpoints individually.
If a core client has a grab on the pointer, does this mean that all
touchpoints in the cluster stop sending events to other clients? Or that
just the base of the cluster is grabbed and the rest is still free for
other clients?
With the current valuator approach, we're locked in to only ever allowing
one client to access the multitouch data. Long term, I don't think that's
good.

> I will say if you're talking about a device hierarchy, I think you
> already have 1 distinction too many.  I think as far as the client
> side interface is concerned, an input should be an input.  The input
> should be free to support children, if desired (the Virtual core
> pointer being an example of a children fan-in driver).  In that
> light, the individual fingers are not a new special type of device, they
> are just children of a node in the hierarchy.  I suppose that will
> get slightly messier when we start worrying about multiple core
> pointers for a single MT sensor, but I don't think this point of
> view makes that any trickier than the alternatives.

Can we please start worrying about this now? because this is the really
tricky case, and unless we manage to sort out attachment, keyboard pairing
and grab behaviour, we're only likely to curse ourselves a year down the
road.

I think touch points are different to pointers, especially long term when we
want more info than just x/y. The init/move/destroy nature makes them more
like permanent dragging devices, but even that's a stretch.
So we need to make them co-exist with the current MD/SD hierarchy. Putting
them inside it is tricky because of the transient nature of the touchpoints.
Keeping them outside of it (like floating slaves, as Carlos implemented it)
is closer but not quite there - there is e.g. no way to pair a keyboard with
a floating slave.
Hence the need for a new device type. Attaching them to an existing MD at
the same level as a mouse makes things tricky: you now need to synchronise
the state of all attached touchpoints with the state of the MD, including
the state of other potential SDs like mice.

Can you outline your approach in more detail please? Because unless I have
that, I can't really start thinking about the details.

Cheers,
  Peter