[RFC] Multitouch support, step one

Simon Thum simon.thum at gmx.de
Mon Mar 15 09:01:48 PDT 2010


Henrik,

if I'm anywhere near understanding this, then you're probably missing
an important point. This is _step one_: getting the information to where
you need it and giving it a basic structure, roughly.

The things you're talking about are better suited to a step two: making
sense of the information. Along those lines you're probably right,
though I'd go even further. But at the moment, consensus is needed on
how to transport MT information in a suitable and forward-compatible
manner.

The sooner this is nailed down, the sooner more abstract concepts can be
tackled.

Cheers,

Simon

Am 15.03.2010 15:41, schrieb Henrik Rydberg:
> Hi Peter,
> 
>> Alrighty, I've been thinking some more about multitouch and here's my
>> current proposal:
>>
>> Good news first - we can probably make it work.
>> Bad news second - not quite just yet, not without kludges.
> 
> I hope this feedback will be taken the right way, as a friendly injection into
> the multitouch discussion. :-)
> 
>>
>> Preamble:
>> Multi-touch as defined in this proposal is limited to single input-point
>> multi-touch. This is suitable for indirect touch devices (e.g. touchpads)
>> and partially suited for direct touch devices provided a touch is equivalent
>> to a single-gesture single-application input.
> 
> User-space applications need tools to *use* MT devices, not route raw data from
> the devices to the application. The latter is not much more complicated than
> opening a file, and everyone can do that already. Thus, unless there is a model
> for how MT devices work and interact with other MT devices, I see little point
> in having an X protocol at all.
> 
> 
>> Details:
>> The data we get from the (Linux) kernel includes essentially all the ABS_MT
>> events, x, y, w, h, etc. We can pack this data into valuators on the device.
>> In the simplest case, a device with two touchpoints would thus send 4
>> valuators - the first two being the coordinate pair for the first touch
>> point, the latter two the coordinates for the second touch point.
>>
>> XI2 provides us with axis labels, so we can label the axes accordingly.
>> Clients that don't read axis labels are left guessing what the fancy values
>> mean, which is exactly what they're doing already anyway.
> 
> The idea of a wide set of dimensions to describe a set of fingers for instance,
> was considered and dropped for the kernel MT interface. There is a definite
> difference between having "three things" and having "two more of the same kind".
> The number of dimensions also increases dramatically, as pointed out by Mr.
> Poole. It makes much more sense to define contacts as multiple instances of the
> same thing, than to define each new contact as potentially something completely
> different.
> 
> 
>> XI2 DeviceEvents provide a bitmask for the valuators present in a device.
>> Hence, a driver can dynamically add and remove valuators from events, thus
>> providing information about the presence of these valuators.
>> e.g. DeviceEvent with valuators [1-4] means two touchpoints down, if the
>> next event only includes valuators [3-4], the first touchpoint has
>> disappeared.
> 
> The idea of adding and removing contacts dynamically I believe is a good idea. A
> contact has a set of attributes (x, y, etc). Why not provide a clean interface
> for the contacts as a concept, rather than mapping the not-so-independent x and
> y values into separate dynamic entities? As an example of the smallest
> meaningful dynamic entity:
> 
> struct Contact {
> 	int tracking_id;
> 	float x, y;
> 	/* etc etc... */
> };
> 
>> Core requires us to always send x/y, hence for core emulation we should
>> always include _some_ coordinates that are easily translated. While the
>> server does caching of absolute values, I think it would be worthwhile to
>> always have an x/y coordinate _independent of the touchpoints_ in the event.
>> The driver can decide which x/y coordinates are chosen if the first
>> touchpoint becomes invalid.
> 
> Seconded, but the single-touch x/y coordinates are properties of a contact
> group, not of a single contact. Example:
> 
> struct ContactGroup {
> 	int group_id;
> 	float x, y;
> 	ContactList list;
> 	/* etc etc... */
> };
> 
>> Hence, the example with 4 valuators above becomes a device with 6 valuators
>> instead. x/y and the two coordinate pairs as mentioned above. If extra data
>> is provided by the kernel driver, these pairs are simply extended into
>> tuples of values, appropriately labeled.
>>
>> Core clients will ignore the touchpoints and always process the first two
>> coordinates.
>> XI1 clients will have to guess what the valuators mean or manually set it up
>> in the client.
>> XI2 clients will automagically work since the axes are labeled. Note that
>> any client that receives such an event always has access to _all_
>> touchpoints on the device. This works fine for say 4-finger swipes on a
>> touchpad but isn't overly useful for the multiple client case, see
>> above.
> 
> This is at the heart of the problem, I believe. In addition to being able to
> work with a set of ContactGroups, like ContactGroupList, one needs the
> possibility to dynamically regroup them, based on geometric information and what
> not. Partitioning is the word. A toolset consisting of at least these functions:
> 
> ContactGroupList partition_contacts_geometrically(ContactList all_contacts);
> ContactGroupList partition_contacts_by_user(ContactList all_contacts);
> ContactGroupList find_contact_groups_in_window(ContactGroupList all_groups);
> etc etc
> 
> ought to be the minimum requirement on the interface, such that applications can
> do something meaningful with the information at hand.
> 
> 
>> Since additional touchpoints are valuators only, grabs work as if the
>> touches belong to a single device. If any client grabs this device, the
>> others will miss out on the touchpoints.
>>
>> XI2 allows devices to change at runtime. Hence a device may add or remove
>> valuators on-the-fly as touchpoints appear and disappear. There is a chance
>> of a race condition here. If a driver decides to add/remove valuators
>> together with the touchpoints, a client that skips events may miss out.
>> e.g. if a DeviceChanged event that removes an axis is followed by one that
>> adds an axis, a client may only take the second one as current, thus
>> thinking the axis was never removed. There is nothing in the XI2 specs that
>> prohibits this. Anyway, adding/removing axes together with touchpoints
>> seems superfluous if we use the presence of an axis as indicator for touch.
>> Rather, I think a device should be set up with a fixed number of valuators
>> describing the default maximum number of touchpoints. Additional ones can be
>> added at runtime if necessary.
> 
> Some events are, as always, more important than others. If the stream bandwidth
> is a concern, there is always the possibility to tag events as "important" and
> "less important", in the same manner as focus events normally are more important
> than mouse movement events.
> 
>>
>> Work needed:
>> - drivers: updated to parse ABS_MT_FOO and forward it on.
>> - X server: the input API still uses the principle of first + num_valuators
>>   instead of the bitmask that the XI2 protocol uses. These calls need to be
>>   added and then used by the drivers.
>> - Protocol: no protocol changes are necessary, though care must be taken in
>>   regards to XI1 clients. 
>>   Although the XI2 protocol does allow device changes, this is not specified
>>   in the XI1 protocol, suggesting that once a device changes, potential XI1
>>   clients should be either ignored or limited to the set of axes present
>>   when they issued the ListInputDevices request. Alternatively, the option
>>   is to just encourage XI1 clients to go the way of the dodo.
>>
>> Corner cases:
>> We currently have a MAX_VALUATORS define of 32. This may or may not be
>> arbitrary and interesting things may or may not happen if we increase that.
>>
>> A device exposing several axes _and_ multitouch axes will need to be
>> appropriately managed by the driver. In this case, the "right" thing to do
>> is likely to expose non-MT axes first and tack the MT axes onto the back.
>> Some mapping may need to be added.
>>
>> The future addition of real multitouch will likely require protocol changes.
>> These changes will need to include a way of differentiating a device that
>> does true multitouch from one that does single-point multi-touch.
>>
>> That's it, pretty much (well, not much actually). Feel free to poke holes
>> into this proposal.
> 
> Ok, in conclusion, my two cents are: do not add MT functionality as valuators
> in X, but implement a proper Contact interface from the start.
> 
> Cheers,
> Henrik
> 



More information about the xorg-devel mailing list