multitouch

Mon Feb 8 01:23:53 PST 2010

On Mon, 8 Feb 2010 16:16:35 +1000 Peter Hutterer <peter.hutterer at who-t.net>
said:

> my apologies for the late answer to this whole thing, but this is sort-of a
> reply to all three emails by you guys.
> 
> On Tue, Jan 19, 2010 at 01:00:27PM +0100, Simon Thum wrote:
> > Bradley T. Hughes wrote:
> > > On 01/18/2010 11:54 PM, ext Carsten Haitzler (The Rasterman) wrote:
> > >> hey guys (sorry for starting a new thread - i only just subscribed -
> > >> lurking on xorg as opposed to xorg-devel).
> > >>
> > >> interesting that this topic comes up now... multitouch. i'm here @
> > >> samsung and got multi-touch capable hardware - supports up to 10
> > >> touchpoints, so need support.
> > >>
> > >> now... i read the thread. i'm curious. brad - why do u think a single
> > >> event (vs multiple) means less context switches (and thus less power
> > >> consumption, cpu used etc.)?
> > > 
> > > Even though the events may be buffered (like you mention), there's no 
> > > guarantee that they will fit nicely into the buffer. I'm not saying that
> > > this will always be the case, but I can foresee the need to write code
> > > that scans the existing event queue, possibly flushes and rereads, scans
> > > again, etc. to ensure that the client did actually get all of the events
> > > that it was interested in.

even then u'd be begging for bugs if you handle events massively out-of-order
(eg several mouse moves/downs/ups between xi2 events). anyway.... not sure
power here is a good argument - there are other ones that are better :)

> > The other guys at my workplace do a touchtable, so I'm not particularly
> > qualified. But there's 10 points in carsten's HW already, and from what
> > I know it's not hard to imagine pressure or what not to become
> > important. That's 30 axes and a limit of 36 axes - if that's not easy to
> > lift I'd be wary of such an approach.
> 
> the 36 axis limit is one defined in XI1. arguably, no sane multi-touch
> application should be using XI1 anyway. XI2 has a theoretical 16-bit limit
> on axis numbers, so that should be sufficient for devices in the near
> future. Yes, there are some limitations in the server but they can be fixed.

good to hear :)
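
a rough sketch of the axis arithmetic above, in python. the helper names are made up for illustration and are not part of any X API. it assumes "30 axes" means 10 points times 3 valuators each (say x, y, pressure), and reads the "16-bit limit" as 65536 distinct axis numbers:

```python
# hypothetical helpers for illustration only, not an X or XI2 API
XI1_MAX_AXES = 36        # the XI1 limit quoted above
XI2_MAX_AXES = 2 ** 16   # assumption: "16-bit limit on axis numbers" = 65536 axes


def axes_needed(touch_points, valuators_per_point):
    """Total axes if every touch point is packed into one device's valuators."""
    return touch_points * valuators_per_point


def max_points(axis_limit, valuators_per_point):
    """How many touch points fit under a given axis limit."""
    return axis_limit // valuators_per_point


print(axes_needed(10, 3))            # 30, the figure from the quoted mail
print(max_points(XI1_MAX_AXES, 3))   # 12 points with x, y, pressure
print(max_points(XI1_MAX_AXES, 6))   # 6 points with x, y, width, height, angle, pressure
```

so 10 points with 3 valuators each just fits under the XI1 limit, but once each point carries 6 valuators, XI1 tops out at 6 points.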

> > > There's also the fact that the current approach that Benjamin suggested 
> > > requires an extra client to manage the slave devices.
> > 
> > OTOH, if you're getting serious, there needs to be an instance
> > translating events into gestures/metaphors anyway. So I don't see the
> > point of avoiding an instance you're likely to need further on.
> 
> A gesture recogniser instance will be mandatory. However, a client that
> modifies the list of input devices on demand and quite frequently hopefully
> won't. Benjamin's approach puts quite a load on the server and on all
> clients (presence events are sent to every client), IMO unnecessarily.

why should one be at the xi2 event level? i'm dubious of this. i've thought it
through a lot - you want gesture recognition happening higher up in the toolkit
or app. you need context - does that gesture make sense. if one gesture was
started but it ended in a way that gesture changed, u need to cancel the
previous action etc. imho multitouch etc. should stick to delivering as much
info that the HW provides as cleanly and simply as possible via xi2 with
minimal interruption of existing app functionality.

> The basic principle for the master/slave division is that even in the
> presence of multiple physical devices, what really counts in the GUI is the
> virtual input points. This used to be a cursor, now it can be multiple
> cursors and with multitouch it will be similar. Most multitouch gestures
> still have a single input point with auxiliary information attached.
> Prime example is the pinch gesture with thumb and index - it's not actually
> two separate points, it's one interaction. Having two master devices for
> this type of gesture is overkill. As a rule of thumb, each hand from each
> user usually constitutes an input point and thus should be represented as a
> master device.

well that depends - if i take both my hands with 2 fingers and now i draw things
with both left and right hand.. i am using my hands as 2 independent core
devices. the problem is - the screen can't tell the difference - neither can
the app. i like 2 core devices - it means u can emulate multitouch screens
with mice... you just need N mice for N fingers. :) this is a good way to
encourage support in apps and toolkits as it can be more widely used.

> So all we need is hardware that can tell the difference between hands :)

aaah we can wish :)

> An example device tree for two hands would thus look like this:
> 
> MD1- MD XTEST device
>    - physical mouse
>    - right hand touch device - thumb subdevice
>                              - index subdevice
> MD2- MD XTEST device
>    - physical trackball
>    - left hand touch device  - thumb subdevice
>                              - index subdevice
>                              - middle finger subdevice
> 
> Where the subdevices are present on demand and may disappear. They may not
> even be actual devices but just represented as flags in the events.
> The X server doesn't necessarily need to do anything with the subdevices.
> What the X server does need however is the division between the input points
> so it can route the events accordingly. This makes it possible to pinch in
> one app while doing something else in another app (note that I am always
> thinking of the multiple apps use-case, never the single app case).

well this assumes u can tell the difference between 2 hands... :)

> When I look at the Qt API, it is device-bound so naturally the division
> between the devices falls back onto X (as it should, anyway).
> The tricky bit about it is - at least with current hardware - how to decide
> how many slave devices and which touchpoints go into which slave device.
> Ideally, the hardware could just tell us but...

well 2nd, 3rd, 4th etc. fingers for 1 hand would go in as slaves no?

> this approach works well for mouse emulation too, since the first subdevice
> on each touch device can be set to emulate mouse events. what it does lead
> to is some duplication in multi-pointer _and_ multi-touch aware applications
> though, since they have to be able to differ between the two.
> 
> until the HW is ready to at least tell the driver what finger is touching,
> etc., the above requires a new event to label the number of subdevices and
> what information is provided. This would be quite similar to Qt's
> QTouchEvent::TouchPoint class and I believe close enough to Windows'
> approach?

well the hw here knows 1st and 2nd etc. finger, if i release my 1st finger and
keep 2nd down - 2nd reports events, 1st doesnt. so it knows the order of
touches - and keeps that synced to the points. but knowing if its thumb or
index or pinky or the other hand etc. - can't find out.
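
a minimal python sketch of that order-of-touch bookkeeping. `TouchSlots` and its methods are hypothetical names, not a driver interface. one assumption beyond what is described above: a new touch takes the lowest free slot.

```python
class TouchSlots:
    """Tracks touches by order of contact: slot 0 is the 1st finger, slot 1 the 2nd, ..."""

    def __init__(self, max_points=10):
        # slot index -> last known (x, y), or None when that finger is up
        self.slots = [None] * max_points

    def down(self, x, y):
        """New touch: take the lowest free slot (assumption) and return its index."""
        for i, pos in enumerate(self.slots):
            if pos is None:
                self.slots[i] = (x, y)
                return i
        raise RuntimeError("no free touch slot")

    def move(self, slot, x, y):
        """Update a touch that is currently down."""
        if self.slots[slot] is None:
            raise KeyError(f"slot {slot} is not down")
        self.slots[slot] = (x, y)

    def up(self, slot):
        """Release a touch; the other slots keep their indices."""
        self.slots[slot] = None

    def active(self):
        """Indices of the slots that are currently down."""
        return [i for i, pos in enumerate(self.slots) if pos is not None]


t = TouchSlots()
first = t.down(10, 10)    # slot 0
second = t.down(50, 50)   # slot 1
t.up(first)               # 1st finger lifts
t.move(second, 60, 60)    # 2nd keeps reporting under its own slot
print(t.active())         # [1]
```

the slot number only says "nth touch". nothing in it says thumb, index, or which hand.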

> I'm still somewhat opposed to sending the extra data as valuators. While
> it's a short-term fix it's a kludge as it lacks some information such as
> when touchpoints appear/disappear. This again can be hacked around, but...

yeah. i agree. has good points, but down sides too. separate devices (slaves)
for the extra touches seems good to me - first touch is a core device and maps
well to existing single-touch screens and mice.

> > >> but - i do see that if osx and windows deliver events as a single blob
> > >> for multiple touches, then if we do something different, we are just
> > >> creating work for developers to adapt to something different. i also see
> > >> the argument for wanting multiple valuators deliver the coords of
> > >> multiple fingers for things like pinch, zoom, etc. etc. BUT this doesnt
> > >> work for other uses - eg virtual keyboard where i am typing with 2
> > >> thumbs - my presses are actually independent presses like 2 core
> > >> pointers in mpx.
> > >>
> > >> so... i think the multiple valuators vs multiple devices for mt events
> > >> is moot as you can argue it both ways and i dont think either side has
> > >> specifically a stronger case... except doing multiple events from
> > >> multiple devices works better with mpx-aware apps/toolkits, and it works
> > >> better for the more complex touch devices that deliver not just x,y but
> > >> x, y, width, height, angle, pressure, etc. etc. per point (so each point
> > >> may have a dozen or more valuators attached to it), and thus delivering
> > >> a compact set of points in a single event makes life harder for getting
> > >> all the extra data for the separate touch events.
> > > 
> > > Indeed. There are cases where one is more convenient over the other and
> > > vice versa. This is what we struggled with for a while when doing the Qt
> > > API for multi-touch. In the end, we went with the single blob approach
> > > and tag each point in the blob with pressed/moved/released state (so that
> > > it's possible to cover both use cases).
> > > 
> > > The only thing that concerns me with the idea of sending each touch point
> > > as a separate device is that it
> > > 
> > >> so i'd vote for how tissoires did it as it allows for more information
> > >> per touch point to be sanely delivered. as such thats how we have it
> > >> working right now. yes - the hw can deliver all points at once but we
> > >> produce n events. but what i'm wondering is.. should we....
> > >>
> > >> 1. have 1, 2, 3, 4 or more (10) core devices, each one is a touch point.
> > >> 2. have 1 core with 9 slave devices (core is first touch and core
> > >> pointer)
> > >> 3. have 1 core for first touch and 9 floating devices for the other
> > >> touches.
> > >>
> > >> they have their respective issues. right now we do #3, but #2 seems very
> > >> logical. #1 seems a bit extreme.
> > > 
> > > I agree, #1 sounds a bit extreme. An approach like 2 or 3 is also doable.
> 
> as I said above, the issue isn't quite as simple and it should scale up to
> the use-case of 2 users with 3 hands on the table, interacting with two
> different applications. So while #2 is the most logical, the number of
> master devices needs to equal the number of virtual input points the users
> want. and that's likely to be one per hand.

but we have a problem now... we only have master and slave. we need N levels. i
need on a collaborative table:

        person
         /   \
       hand hand
      / | | | | \
finger /  | |  \ finger
 finger   | |   finger
     finger finger

in the end... n levels is likely going to be needed. we can flatten this sure,
but in the end you will not be able to anymore. :(
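
to make the flattening concrete, a small python sketch. `Node` and `flatten` are illustrative names only. it assumes a 3-level person/hand/finger tree squeezed into the 2 levels we have today, with hands as masters and fingers as slaves:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level in the hypothetical person -> hand -> finger tree."""
    name: str
    children: list = field(default_factory=list)


def flatten(person):
    """Collapse a 3-level tree into {master: [slaves]}.

    Hands become masters and fingers become slaves. The person level has no
    slot of its own, so it survives only as a prefix in the master's name.
    """
    return {
        f"{person.name}/{hand.name}": [finger.name for finger in hand.children]
        for hand in person.children
    }


user = Node("person", [
    Node("left hand", [Node("thumb"), Node("index"), Node("middle")]),
    Node("right hand", [Node("thumb"), Node("index")]),
])

for master, slaves in flatten(user).items():
    print(master, slaves)
```

with 2 people at the table, the only thing tying a hand to its owner after flattening is that name prefix. the device hierarchy itself can't say it.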

> > >> remember - need to keep compatibility with single touch (mouse only)
> > >> events and apps as well as expand to be able to get the multi-touch
> > >> events if wanted.
> > > 
> > > Exactly. Do #2 and #3 keep that compatibility? My understanding is that
> > > if we did #2, then the master pointer would still deliver events for all
> > > slaves (with DeviceChanged events mixed in between). Couldn't this
> > > confuse non-multi-touch and/or non-mpx aware clients?
> 
> non-MPX aware applications wouldn't see the DeviceChanged events, they'd
> just see a really fast moving device. Similar to what happens now if you'd
> use two single-point touchscreens at the same time. non-multitouch (i.e.
> MPX) applications must be able to cope with it, there's nothing special
> about it, it's the same as using a mouse and a touchpad at the same time.
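
a toy python illustration of the "really fast moving device" effect. it assumes each slave produces a time-sorted list of (time, x, y) tuples and that the client only sees the merged positions. the function names are made up:

```python
import heapq
from math import hypot


def core_pointer_view(*device_streams):
    """Merge per-device (time, x, y) streams by time and drop the device identity.

    Each input stream must already be sorted by time.
    """
    merged = heapq.merge(*device_streams)
    return [(x, y) for _, x, y in merged]


def jumps(path):
    """Distance between consecutive positions as the client sees them."""
    return [hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(path, path[1:])]


left = [(0, 100, 100), (2, 101, 100)]    # barely moving
right = [(1, 900, 500), (3, 901, 500)]   # barely moving, far away

path = core_pointer_view(left, right)
print(path)          # [(100, 100), (900, 500), (101, 100), (901, 500)]
print(jumps(path))   # every step is a jump across the screen
```

each finger moves 1 pixel, but the merged pointer hops back and forth between the 2 positions on every event.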

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster at rasterman.com