multitouch

Sun Feb 7 22:16:35 PST 2010

My apologies for the late answer to this whole thing; this is sort of a
reply to all three emails by you guys.

On Tue, Jan 19, 2010 at 01:00:27PM +0100, Simon Thum wrote:
> Bradley T. Hughes wrote:
> > On 01/18/2010 11:54 PM, ext Carsten Haitzler (The Rasterman) wrote:
> >> hey guys (sorry for starting a new thread - i only just subscribed - lurking on
> >> xorg as opposed to xorg-devel).
> >>
> >> interesting that this topic comes up now... multitouch. i'm here @ samsung and
> >> got multi-touch capable hardware - supports up to 10 touchpoints, so need
> >> support.
> >>
> >> now... i read the thread. i'm curious. brad - why do u think a single event (vs
> >> multiple) means less context switches (and thus less power consumption, cpu
> >> used etc.)?
> > 
> > Even though the events may be buffered (like you mention), there's no 
> > guarantee that they will fit nicely into the buffer. I'm not say that this 
> > will always be the case, but I can foresee the need to write code that scans 
> > the existing event queue, possibly flushes and rereads, scans again, etc. to 
> > ensure that the client did actually get all of the events that it was 
> > interested in.
> They other guys at my workplace do a touchtable, so I'm not particularly
> qualified. But there's 10 points in carsten's HW already, and from what
> I know it's not hard to imagine pressure or what not to become
> important. That's 30 axes and a limit of 36 axes - if that's not easy to
> lift I'd be wary of such an approach.

The 36-axis limit is the one defined in XI1. Arguably, no sane multi-touch
application should be using XI1 anyway. XI2 has a theoretical 16-bit limit
on axis numbers, so that should be sufficient for devices in the near
future. Yes, there are some limitations in the server, but they can be fixed.
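
As a rough check of the numbers in this exchange, here is a small C sketch.
The figure of 3 axes per touchpoint follows Simon's "30 axes" estimate and
the dozen valuators per point follows Carsten's description further down.
The constants and function names are made up for illustration, and the XI2
figure simply assumes 2^16 distinct axis numbers; none of this is server
code.

```c
/* Illustrative constants only, not taken from any X header. */
#define XI1_AXIS_LIMIT 36L      /* XI1 limit discussed in this thread */
#define XI2_AXIS_LIMIT 65536L   /* assumption: 16-bit axis numbers, 2^16 */

/* Total axes if every touchpoint carries the same number of axes. */
long axes_needed(long touchpoints, long axes_per_point)
{
    return touchpoints * axes_per_point;
}

/* Non-zero if the requested number of axes fits under the given limit. */
int axes_fit(long needed, long limit)
{
    return needed <= limit;
}
```

With 10 touchpoints, 3 axes each still fit under the XI1 limit, while a
dozen axes each do not; both fit comfortably under the assumed XI2 limit.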

> > There's also the fact that the current approach that Benjamin suggested 
> > requires an extra client to manage the slave devices.
> 
> OTOH, if you're getting serious, there needs to be an instance
> translating events into gestures/metaphors anyway. So I don't see the
> point of avoiding an instance you're likely to need further on.

A gesture recogniser instance will be mandatory. However, a client that
modifies the list of input devices on demand and quite frequently hopefully
won't. Benjamin's approach puts quite a load on the server and on all
clients (presence events are sent to every client), IMO unnecessarily.

The basic principle for the master/slave division is that even in the
presence of multiple physical devices, what really counts in the GUI is the
virtual input points. This used to be a cursor, now it can be multiple
cursors and with multitouch it will be similar. Most multitouch gestures
still have a single input point with auxiliary information attached.
Prime example is the pinch gesture with thumb and index - it's not actually
two separate points, it's one interaction. Having two master devices for
this type of gesture is overkill. As a rule of thumb, each hand from each
user usually constitutes an input point and thus should be represented as a
master device. 

So all we need is hardware that can tell the difference between hands :)

An example device tree for two hands would thus look like this:

MD1- MD XTEST device
   - physical mouse
   - right hand touch device - thumb subdevice
                             - index subdevice
MD2- MD XTEST device
   - physical trackball
   - left hand touch device  - thumb subdevice
                             - index subdevice
                             - middle finger subdevice

Where the subdevices are present on demand and may disappear. They may not
even be actual devices but just represented as flags in the events.
The X server doesn't necessarily need to do anything with the subdevices.
What the X server does need however is the division between the input points
so it can route the events accordingly. This makes it possible to pinch in
one app while doing something else in another app (note that I am always
thinking of the multiple apps use-case, never the single app case).
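
To make the routing idea concrete, here is a minimal sketch in C. All
types and names are hypothetical and are not X server structures; the
sketch only assumes that each master device knows its attached slaves and
the client that currently receives its events. The subdevices play no part
in the routing decision, only the master does.

```c
#include <stddef.h>

#define MAX_SLAVES 4            /* arbitrary bound for this sketch */

/* Hypothetical master device: one virtual input point (e.g. one hand). */
struct master {
    int id;                     /* master device id, e.g. MD1 or MD2 */
    int target_client;          /* client currently receiving its events */
    int slaves[MAX_SLAVES];     /* ids of attached slave devices */
    size_t nslaves;
};

/*
 * Find the client that should receive an event from the given slave.
 * Returns -1 if the slave is not attached to any master (floating).
 */
int route_event(const struct master *masters, size_t nmasters, int slave_id)
{
    for (size_t m = 0; m < nmasters; m++)
        for (size_t s = 0; s < masters[m].nslaves; s++)
            if (masters[m].slaves[s] == slave_id)
                return masters[m].target_client;
    return -1;
}
```

With two masters pointing at two different clients, a pinch on the right
hand's touch device ends up in one app while the left hand's events go to
the other.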

When I look at the Qt API, it is device-bound so naturally the division
between the devices falls back onto X (as it should, anyway).
The tricky bit about it is - at least with current hardware - how to decide
how many slave devices there should be and which touchpoints go into which
slave device. Ideally, the hardware could just tell us but...

This approach works well for mouse emulation too, since the first subdevice
on each touch device can be set to emulate mouse events. What it does lead
to is some duplication in multi-pointer _and_ multi-touch aware applications
though, since they have to be able to differentiate between the two.

Until the HW is ready to at least tell the driver which finger is touching,
etc., the above requires a new event to label the number of subdevices and
what information is provided. This would be quite similar to Qt's
QTouchEvent::TouchPoint class and, I believe, close enough to Windows'
approach?
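
A sketch of what such an event could carry, in C. This is purely
hypothetical: it is neither an existing XI2 event nor Qt's actual class
layout, and every name and field is an assumption made for illustration.
The point is that each touchpoint is labelled and carries its own
pressed/moved/released state, so appearing and disappearing touchpoints
are explicit.

```c
#include <stddef.h>

#define MAX_TOUCH_POINTS 10     /* arbitrary bound for this sketch */

enum touch_state { TOUCH_PRESSED, TOUCH_MOVED, TOUCH_RELEASED };

/* Hypothetical per-touchpoint record (one subdevice or finger). */
struct touch_point {
    int id;                     /* label of the subdevice */
    enum touch_state state;     /* appeared, moved, or disappeared */
    double x, y;
    double pressure;
};

/* Hypothetical event: all touchpoints of one master in a single blob. */
struct touch_event {
    int master_id;
    size_t npoints;
    struct touch_point points[MAX_TOUCH_POINTS];
};

/* Count the touchpoints still in contact after this event. */
size_t touch_points_down(const struct touch_event *ev)
{
    size_t n = 0;
    for (size_t i = 0; i < ev->npoints && i < MAX_TOUCH_POINTS; i++)
        if (ev->points[i].state != TOUCH_RELEASED)
            n++;
    return n;
}
```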

I'm still somewhat opposed to sending the extra data as valuators. While
it's a short-term fix, it's a kludge, as it lacks some information such as
when touchpoints appear/disappear. This again can be hacked around, but...

> >> but - i do see that if osx and windows deliver events as a single blob for
> >> multiple touches, then if we do something different, we are just creating work
> >> for developers to adapt to something different. i also see the arguument for
> >> wanting multiple valuators deliver the coords of multiple fingers for things
> >> like pinch, zoom, etc. etc. BUT this doesnt work for other uses - eg virtual
> >> keyboard where i am typing with 2 thumbs - my presses are actually independent
> >> presses like 2 core pointers in mpx.
> >  >
> >> so... i think the multiple valuators vs multiple devices for mt events is moot
> >> as you can argue it both ways and i dont think either side has specifically a
> >> stronger case... except doing multiple events from multiple devices works
> >> better with mpx-aware apps/toolkits, and it works better for the more complex
> >> touch devices that deliver not just x,y but x, y, width, height, angle,
> >> pressure, etc. etc. per point (so each point may have a dozen or more valuators
> >> attached to it), and thus delivering a compact set of points in a single event
> >> makes life harder for getting all the extra data for the separate touch events.
> > 
> > Indeed. There are cases where one is more convenient over the other and vice 
> > versa. This is what we struggled with for a while when doing the Qt API for 
> > multi-touch. In the end, we went with the single blob approach and tag each 
> > point in the blob with pressed/moved/released state (so that it's possible 
> > to cover both use cases).
> > 
> > The only thing that concerns me with the idea of sending each touch point as 
> > a separate device is that it
> > 
> >> so i'd vote for how tissoires did it as it allows for more information per
> >> touch point to be sanely delivered. as such thats how we have it working right
> >> now. yes - the hw can deliver all points at once but we produce n events. but
> >> what i'm wondering is.. should we....
> >>
> >> 1. have 1, 2, 3, 4 or more (10) core devices, each one is a touch point.
> >> 2. have 1 core with 9 slave devices (core is first touch and core pointer)
> >> 3. have 1 core for first touch and 9 floating devices for the other touches.
> >>
> >> they have their respective issues. right now we do #3, but #2 seems very
> >> logical. #1 seems a bit extreme.
> > 
> > I agree, #1 sounds a bit extreme. An approach like 2 or 3 is also doable.

As I said above, the issue isn't quite as simple, and it should scale up to
the use-case of 2 users with 3 hands on the table, interacting with two
different applications. So while #2 is the most logical, the number of
master devices needs to equal the number of virtual input points the users
want, and that's likely to be one per hand.

> >> remember - need to keep compatibility with single touch (mouse only) events and
> >> apps as well as expand to be able to get the multi-touch events if wanted.
> > 
> > Exactly. Do #2 and #3 keep that compatibility? My understanding is that if 
> > we did #2, then the master pointer would still deliver events for all slaces 
> > (with DeviceChanged events mixed in between). Couldn't this confuse 
> > non-mulit-touch and/or non-mpx aware clients?

Non-MPX aware applications wouldn't see the DeviceChanged events; they'd
just see a really fast moving device, similar to what happens now if you
use two single-point touchscreens at the same time. Non-multitouch (i.e.
MPX) applications must be able to cope with it. There's nothing special
about it; it's the same as using a mouse and a touchpad at the same time.

Cheers,
  Peter