[RFC] Multitouch support, step one

Mon Mar 15 04:55:11 PDT 2010

Peter, I did not find the time to respond last week, so I will do so now.

On 15/03/2010 09:50, Peter Hutterer wrote:
> On Mon, Mar 15, 2010 at 06:36:15PM +1100, Carsten Haitzler wrote:
>> On Mon, 15 Mar 2010 16:56:05 +1000 Peter Hutterer<peter.hutterer at who-t.net>
>> said:
>>
>> cool - no comment here means "ok" :) so just comments on the questioney bits.
>>
>>> Alrighty, I've been thinking some more about multitouch and here's my
>>> current proposal:
>>>
>>> Good news first - we can probably make it work.
>>> Bad news second - not quite just yet, not without kludges.
>>>
>>> Preamble:
>>> Multi-touch as defined in this proposal is limited to single input-point
>>> multi-touch. This is suitable for indirect touch devices (e.g. touchpads)
>>> and partially suited for direct touch devices provided a touch is equivalent
>>> to a single-gesture single-application input.
>>>
>>> "true" multitouch, i.e. multiple independent input points across multiple
>>> clients is not covered here; at this point this problem is unsolved.
>>> The trick is to get us the former, without limiting future use of the
>>> latter.

And I think it would be a very good selling point for Linux to be able
to say that we have real multitouch across different applications.

>>>
>>> Disclaimer:
>>> I believe this is pretty much what Win 7 or OS X do so I won't bother
>>> claiming this being innovative. This isn't exactly my idea either, I'm just
>>> writing up what I got from talking to Benjamin, Bradley, Henrik, Stephane,
>>> and many more.
>>>
>>> Details:
>>> The data we get from the (Linux) kernel includes essentially all the ABS_MT
>>> events, x, y, w, h, etc. We can pack this data into valuators on the device.
>>> In the simplest case, a device with two touchpoints would thus send 4
>>> valuators - the first two being the coordinate pair for the first touch
>>> point, the latter two the coordinates for the second touch point.

As Carsten said, we could send all the MT valuators for each touch.
The valuator description would then look like this:
Val0_not_mt, Val1_not_mt, ... Val_i_not_mt, TrackingID0, ABS_MT_X0, 
ABS_MT_Y0, other_mt_0, TrackingID1, ABS_MT_X1, .... etc
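
A rough sketch of that layout, in Python purely for illustration. The
function names and the exact label strings are made up for this example
and are not part of any driver or of XI2; the only point is that non-MT
axes come first and each touch then gets an identical block of axes.

```python
def build_valuator_labels(non_mt_labels, per_touch_labels, max_touches):
    """Return the flat label list: non-MT axes first, then one block per touch."""
    labels = list(non_mt_labels)
    for touch in range(max_touches):
        # Suffix each per-touch label with the touch number, e.g. ABS_MT_X0.
        labels.extend(f"{name}{touch}" for name in per_touch_labels)
    return labels


def valuator_index(num_non_mt, axes_per_touch, touch, axis_offset):
    """Index of one axis of one touch in the flat valuator array."""
    return num_non_mt + touch * axes_per_touch + axis_offset


# Hypothetical device: two non-MT axes, two touches, three axes per touch.
PER_TOUCH = ["TrackingID", "ABS_MT_X", "ABS_MT_Y"]
labels = build_valuator_labels(["Abs X", "Abs Y"], PER_TOUCH, max_touches=2)

# ABS_MT_Y of the second touch (touch 1, offset 2 within its block).
second_touch_y = valuator_index(2, len(PER_TOUCH), touch=1, axis_offset=2)
```

Because every touch block has the same shape, a client only needs the
number of non-MT axes and the block size to find any value.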

>>>
>>> XI2 provides us with axis labels, so we can label the axes accordingly.
>>> Clients that don't read axis labels are left guessing what the fancy values
>>> mean, which is exactly what they're doing already anyway.
>>
>> ok. here is where i ask.. what are these labels to be? pointless knowing there
>> are labels - unless we know what they should be to indicate what is what. it's
>> still guessing if we don't know what they should be :)
>
> See Benjamin's commit a34812b09000db2ff2a1dc6182602839123edd4e on master.
> The idea is that your 2-touchpoint device provides the following labels,
> "Abs X", "Abs Y", "Abs MT Position X", "Abs MT Position Y", "Abs MT Position
> X", "Abs MT Position Y".

Agreed, but remember to send the other axes too.

>
> Having said that, I now realise that it's hard to tell them apart this way,
> you can only look for repetition in axes. This is a fair assumption I guess,
> I'm not sure if any devices have different capabilities on different
> touchpoints.
>

I have never seen such a device, and I don't see how one could exist.
The strangest device I have is the Apple Magic Mouse, which is both
relative and absolute, but all its touch points report the same axes.

By the way, I don't think the kernel could handle such a device, as it
serializes the valuators, and if a touch does not report an axis, that
axis will simply have the value 0.

>>> XI2 DeviceEvents provide a bitmask for the valuators present in a device.
>>> Hence, a driver can dynamically add and remove valuators from events, thus
>>> providing information about the presence of these valuators.
>>> e.g. DeviceEvent with valuators [1-4] means two touchpoints down, if the
>>> next event only includes valuators [3-4], the first touchpoint has
>>> disappeared.

I ran a few tests yesterday (involuntarily, though) with the Magic Mouse.
Since we can send only part of the valuators, why not pack the currently
active touches at the beginning? This will require moving some valuators
around, but as long as we keep sending the trackingID, it won't hurt the
client.
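
To make the packing idea concrete, here is a small Python sketch. It
assumes a hypothetical layout of two core x/y valuators followed by
(trackingID, x, y) blocks; pack_event and unpack_event are invented names,
not server or driver API.

```python
def pack_event(core_xy, touches):
    """Pack active touches at the front: core x/y, then one block per touch.

    touches is a list of (tracking_id, x, y) tuples for active touches only.
    """
    values = list(core_xy)
    for tracking_id, x, y in touches:
        values.extend((tracking_id, x, y))
    return values


def unpack_event(values, num_non_mt=2, axes_per_touch=3):
    """Client side: recover {tracking_id: (x, y)} from a packed event."""
    mt = values[num_non_mt:]
    if len(mt) % axes_per_touch != 0:
        raise ValueError("event does not contain whole touch blocks")
    return {
        mt[i]: (mt[i + 1], mt[i + 2])
        for i in range(0, len(mt), axes_per_touch)
    }


# Two fingers down, with device tracking IDs 7 and 9.
before = pack_event((10, 20), [(7, 10, 20), (9, 30, 40)])
# Finger 7 is lifted: finger 9 moves into the first block of the event.
after = pack_event((31, 41), [(9, 31, 41)])

touches_before = unpack_event(before)
touches_after = unpack_event(after)
```

Finger 9 changes position in the event, but the client still recognises
it through its trackingID, which is why moving valuators around is
harmless as long as the ID travels with the touch.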

Another argument for keeping the trackingID valuator: some devices
(Stantum and Magic Mouse) send a trackingID that differs from the touch
point index, i.e. the trackingID is between 1 and 255 on the Stantum, and
between 1 and 16 on the Magic Mouse. I don't know whether the user app
should use this directly, or whether a higher-level ID could do the job.

>>>
>>> Core requires us to always send x/y, hence for core emulation we should
>>> always include _some_ coordinates that are easily translated. While the
>>> server does caching of absolute values, I think it would be worthwhile to
>>> always have an x/y coordinate _independent of the touchpoints_ in the event.
>>> The driver can decide which x/y coordinates are chosen if the first
>>> touchpoint becomes invalid.
>>
>> hmm so ok. i press 1 finger down - i get a xi2 event with N*2 valuators (N
>> being the maximum # of touch points supported lets say). bitmask tells me which
>> are "active". what about core events? i press 1 finger down - i'll get a xi2
>> event AND a core event.
>
> Any event is only ever sent as _either_ XI, XI2 or core. Besides, are you
> listening for core events and XI2 at the same time? If so, why? The idea of
> XI2 was that once a client acknowledges XI2 exists, core events should be a
> thing of the past for this client. If it's listening to both it'll have to
> magically find out which events are caused by the same device.
>
>> how do i know this for sure? how do i know even though
>> xi2 claims to have an input device for mt input - that it may not send events
>> (it's used differently). now i press a 2nd finger - no problems. core events
>> and xi2 events keep coming - now i release my first finger. what happens to
>> core events. from the above i gather that the driver will keep sending core
>> events now - but for the remaining pressed finger. right?
>
> nearly correct, yes. Drivers don't send core events. They send events, and
> some of these events may be delivered as core events depending on the
> position of the pointer, etc. A driver has no say over what is a core event
> or not (this has changed with 1.6 but even before that it wasn't strictly
> enforceable).
>
> Your example is why I suggested to always add x/y to the touch points, the
> driver is the most likely thing to know which point should be chosen when
> the first finger leaves the area. This may be either the second touchpoint
> or the logical centre of the remaining touchpoints, or...
>
>
>>> Hence, the example with 4 valuators above becomes a device with 6 valuators
>>> instead. x/y and the two coordinate pairs as mentioned above. If extra data
>>> is provided by the kernel driver, these pairs are simply extended into
>>> tuples of values, appropriately labeled.
>>>
>>> Core clients will ignore the touchpoints and always process the first two
>>> coordinates.
>>
>> first 2 that are active or first 2 - if not active then no core events?
>
> whatever the driver forwarded as x/y. If the driver doesn't forward it, in
> the extreme case this may be the x/y of the _previous_ position of the first
> finger (the server caches it anyway).
> given a driver sending data for touchpoints 2 and 3 only, the data in the
> event would then be:
> last_core_x, last_core_y, nil, nil, tp2_x, tp2_y, tp3_x, tp3_y
>
>
>>> XI1 clients will have to guess what the valuators mean or manually set it up
>>> in the client.
>>> XI2 clients will automagically work since the axes are labeled. Note that
>>> any client that receives such an event always has access to _all_
>>> touchpoints on the device. This works fine for say 4-finger swipes on a
>>> touchpad but isn't overly useful for the multiple client case, see
>>> above.
>>> Since additional touchpoints are valuators only, grabs work as if the
>>> touches belong to a single device. If any client grabs this device, the
>>> others will miss out on the touchpoints.
>>
>> aaah but as above.. no automagic working without knowing these labels :)
>>
>>> XI2 allows devices to change at runtime. Hence a device may add or remove
>>> valuators on-the-fly as touchpoints appear and disappear. There is a chance
>>> of a race condition here. If a driver decides to add/remove valuators
>>> together with the touchpoints, a client that skips events may miss out.
>>> e.g. if a DeviceChanged event that removes an axis is followed by one that
>>> adds an axis, a client may only take the second one as current, thus
>>> thinking the axis was never removed. There is nothing in the XI2 specs that
>>> prohibits this. Anyways, adding/removing axes together with touchpoints
>>> seems superfluous if we use the presence of an axis as indicator for touch.
>>> Rather, I think a device should be set up with a fixed number of valuators
>>> describing the default maximum number of touchpoints. Additional ones can be
>>> added at runtime if necessary.
>>
>> agreed. i really see this having a fixed # of touch points - and not changing -
>> unless you literally unplug/plug in new hardware that has different features
>> (has more or less in the way of touch point support).

We can have a fixed number of touch points but send only the required
ones, so I agree too. The real question is: how many touch points do we
have? The kernel knows how many touches a device can send, as the data
are not serialized there. But beyond that, we have no idea how many
touches the device supports.

With the mask system (or with the touches packed at the beginning), we
will send only the right number of touches, but the device description
will be very heavy. If each point has 5 axes (trackingID, x, y, width,
height for instance), we will have 50 valuators if we support 10
touches ;-) Anyway, that's not the point here.
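
For what it's worth, the arithmetic can be checked against the
MAX_VALUATORS value of 32 mentioned elsewhere in this thread. The helper
names below are invented for this Python sketch.

```python
MAX_VALUATORS = 32  # the value quoted in this thread


def valuators_needed(num_non_mt, axes_per_touch, max_touches):
    """Total valuators for a device with a fixed block of axes per touch."""
    return num_non_mt + axes_per_touch * max_touches


def max_touches_that_fit(num_non_mt, axes_per_touch, limit=MAX_VALUATORS):
    """Largest number of touches whose valuators stay within the limit."""
    return (limit - num_non_mt) // axes_per_touch


# 10 touches with 5 axes each (trackingID, x, y, width, height).
needed = valuators_needed(num_non_mt=0, axes_per_touch=5, max_touches=10)
# How many such touches fit if we also keep the two core x/y valuators?
fit_with_core_xy = max_touches_that_fit(num_non_mt=2, axes_per_touch=5)
```

With 5 axes per touch, only 6 touches fit under a limit of 32, with or
without the two extra x/y valuators.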

>>
>>> Work needed:
>>> - drivers: updated to parse ABS_MT_FOO and forward it on.
>>> - X server: the input API still uses the principle of first + num_valuators
>>>    instead of the bitmask that the XI2 protocol uses. These calls need to be
>>>    added and then used by the drivers.
>>> - Protocol: no protocol changes are necessary, though care must be taken in
>>>    regards to XI1 clients.
>>>    Although the XI2 protocol does allow device changes, this is not specified
>>>    in the XI1 protocol, suggesting that once a device changes, potential XI1
>>>    clients should be either ignored or limited to the set of axes present
>>>    when they issued the ListInputDevices request. Alternatively, the option
>>>    is to just encourage XI1 clients to go the way of the dodo.
>>>
>>> Corner cases:
>>> We currently have a MAX_VALUATORS define of 32. This may or may not be
>>> arbitrary and interesting things may or may not happen if we increase that.
>>
>> another problem - no ability to do "pressure" here. ie have each touch point
>> have a radius for example (x and y radius) etc. etc. ??? what happened to that?
>
> The kernel's MT API caters for width/height, orientation and a few other
> things (see linux/input.h, essentially, we're just mirroring here anyway).
> what it doesn't cater for yet is MT pressure though IIRC I've either seen a
> patch float past or at least the talk about it. Since we only need to add
> another label, that's easy enough. But good point, we mustn't forget this.

Currently, we don't aim to modify the data the device sends. If it
provides a pressure value (starting from kernel 2.6.33, I think), we
should pass it on to the client. But I don't think we should make up an
arbitrary value derived from sizeX and sizeY.
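
In other words, the per-touch axes would simply be whatever the device
reports. A minimal Python sketch of that rule, using the ABS_MT_* names
from linux/input.h as plain strings; the per_touch_axes helper and the
ordering of the list are assumptions made for this example.

```python
# ABS_MT_* axis names from linux/input.h, in an arbitrary fixed order.
KNOWN_MT_AXES = [
    "ABS_MT_TRACKING_ID",
    "ABS_MT_POSITION_X",
    "ABS_MT_POSITION_Y",
    "ABS_MT_TOUCH_MAJOR",
    "ABS_MT_TOUCH_MINOR",
    "ABS_MT_ORIENTATION",
    "ABS_MT_PRESSURE",
]


def per_touch_axes(reported):
    """Keep only the axes the device really reports; never synthesize one."""
    return [axis for axis in KNOWN_MT_AXES if axis in reported]


# Hypothetical device that reports position and contact size, but no pressure.
device_axes = {
    "ABS_MT_POSITION_X",
    "ABS_MT_POSITION_Y",
    "ABS_MT_TOUCH_MAJOR",
    "ABS_MT_TOUCH_MINOR",
}
axes = per_touch_axes(device_axes)
```

Here the client sees the contact size axes and no pressure axis, instead
of a pressure value guessed from the size.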

>
> I think this answers all of your questions, let me know if I skipped
> something.
>
> Cheers,
>    Peter
>
>>> A device exposing several axes _and_ multitouch axes will need to be
>>> appropriately managed by the driver. In this case, the "right" thing to do
>>> is likely to expose non-MT axes first and tack the MT axes onto the back.
>>> Some mapping may need to be added.
>>

Agreed (see above).

>> you mean axes like each touchpoint width/height (radius) etc. ?
>
>
>>> The future addition of real multitouch will likely require protocol changes.
>>> These changes will need to include a way of differentiating a device that
>>> does true multitouch from one that does single-point multi-touch.
>>>
>>> That's it, pretty much (well, not much actually). Feel free to poke holes
>>> into this proposal.
>>
>> *poke* :)
>

Cheers,
Benjamin