Multitouch followup: gesture recognition?

Simon Thum simon.thum at
Tue Mar 23 16:06:30 PDT 2010

[CC'ing Peter, see below]

On 23.03.2010 18:42, Florian Echtler wrote:
> Just for my understanding: when talking about a special client, you think of 
> something like a (compositing) window manager?
Yes, 'special' since it registers itself for rights (and duties) only
one client shall possess.

>> A library can be done 'right now', since apps are free to do so. It has
>> the advantage of a close connection to the consuming app, but also the
>> associated disadvantages.
>> In particular, how to cope with global gestures, e.g. switching an app
>> or backgrounding it? Apparently, such things should be consistent. I
>> imagine a desktop environment might want to put up such a special
>> client, like they have preference for their WM.
> Quite correct; this is a problem my standalone library also has right
> now. It's currently only supporting fullscreen clients properly.
I'm no expert in X event routing, so maybe this isn't such a big problem
after all. As I said, it's just a sketch.

>> § A new 'gesture' event gets created, like:
> [...]
>> A prosaic example would be an app learning that there's a
>> "DIRECTED_DRAGGING gesture going on, starting at (200, 100)@70 degrees,
>> now being at (300, 100)@95 deg" and use this information to navigate
>> within a 3D-view. Also note the omission of (x,y) from the general
>> gesture event, since I'd deem it specific. Other gestures may not have a
>> primary x,y.
> I agree, this is quite similar to the way I have implemented it right now.
> This applies, e.g., to a relative motion gesture which only delivers a
> vector.
I took this particular example from a table project I'll be working with
soon. It actually covers a vector _and_ a movement, independently
derived from two touch points. Nice for camera navigation.
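
To make the shape of such an event concrete, here is a minimal Python
sketch. It is not part of any existing X API; GestureEvent, Phase,
directed_drag and every field name are made up for illustration. The
point it shows is that position and angle live in a gesture-specific
payload rather than in the general event.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    START = "start"
    CONTINUE = "cont"
    END = "end"


@dataclass(frozen=True)
class GestureEvent:
    """Hypothetical general gesture event: deliberately no top-level x, y."""
    gesture_id: int       # ties start/cont/end of one gesture together
    kind: str             # e.g. "DIRECTED_DRAGGING"
    phase: Phase
    timestamp_ms: int
    params: dict = field(default_factory=dict)  # gesture-specific payload


def directed_drag(gesture_id, phase, timestamp_ms, x, y, angle_deg):
    """Build a DIRECTED_DRAGGING event; position and angle are specific to it."""
    return GestureEvent(gesture_id, "DIRECTED_DRAGGING", phase, timestamp_ms,
                        {"x": x, "y": y, "angle_deg": angle_deg})


# The example from the text: started at (200, 100)@70 deg, now at (300, 100)@95 deg.
start = directed_drag(1, Phase.START, 1000, 200, 100, 70.0)
now = directed_drag(1, Phase.CONTINUE, 1040, 300, 100, 95.0)

# A 3D view could use the change in angle to rotate its camera.
turned = now.params["angle_deg"] - start.params["angle_deg"]
```

A relative-motion gesture would simply carry a different payload, e.g.
only a vector.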

>> § A special gesture client (composite-like)
>> This client might receive events as discussed - but all of them - by
>> virtue of registering with the server. It analyzes the stream, and
>> whenever it thinks something important happened, it tells the server.
>> The server then dispatches a corresponding gesture event, according to
>> its state and some constraints given by the special client (e.g.
>> Florian's event regions, gesture-specific delivery constraints, ...)
>> which may not be part of the event as delivered.
> What kind of events are you considering here? 
The hypothetical gesture event, as arriving in a gesture-aware client.
It may not contain information like delivery constraints; its business
is just that it got the event.
> Could a client generate new XI events?
I don't have the impression this would suit it, though I could be wrong.
At any rate, a client can't create a new event _type_. It can create
some events though, e.g. via XTst, of predefined types.

My idea was that the special client instructs the server what gesture
events to generate and how to dispatch them, whenever it thinks it has
spotted a gesture. The server tracks some minimal state to ensure
consistency and dispatches on behalf of the special gesture client.
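
A toy model of that flow, in Python, might look like the following. This
is only a sketch under assumptions: ToyServer and its three methods are
hypothetical names, the rectangular regions stand in for whatever
delivery constraints would really be used, and nothing here corresponds
to an existing X request.

```python
class ToyServer:
    """Toy stand-in for the server side of the idea; not real X code."""

    def __init__(self):
        self.gesture_client = None   # at most one special client
        self.listeners = []          # (client name, region, inbox)

    def register_gesture_client(self, name):
        # Only one client may hold the 'special' role.
        if self.gesture_client is not None:
            raise RuntimeError("a gesture client is already registered")
        self.gesture_client = name

    def select_gestures(self, name, region):
        # A gesture-aware client asks for gestures inside region
        # (x0, y0, x1, y1) and gets an inbox the server appends to.
        inbox = []
        self.listeners.append((name, region, inbox))
        return inbox

    def dispatch_gesture(self, sender, event, point):
        # Only the special client may ask for a dispatch. The point is a
        # delivery constraint and is not added to the delivered event.
        if sender != self.gesture_client:
            raise PermissionError("only the gesture client may dispatch")
        x, y = point
        delivered = 0
        for _name, (x0, y0, x1, y1), inbox in self.listeners:
            if x0 <= x < x1 and y0 <= y < y1:
                inbox.append(event)
                delivered += 1
        return delivered


server = ToyServer()
server.register_gesture_client("recognizer")
viewer = server.select_gestures("viewer", (0, 0, 400, 300))
editor = server.select_gestures("editor", (400, 0, 800, 300))

event = {"kind": "DIRECTED_DRAGGING", "phase": "start"}
count = server.dispatch_gesture("recognizer", event, (200, 100))
```

Here only the viewer's inbox receives the event, since the dispatch
point falls inside its region.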

Peter, maybe you can comment how suitable current mechanisms for input
events from clients might be?

>> The important point here is that gesture events are asynchronous, so
>> there's no need to wait inside the event loop. Gestures correlate to,
>> but don't strictly depend on other input events. Their timestamps may
>> not be in order for this reason.
> Could you elaborate on that a bit more? I fear I'm missing some background
> information here.
All X events have a timestamp, but AFAIK order isn't guaranteed. The
event loop refers to the server's event processing loop, which mustn't
block or allocate memory, and should be well-behaved in general.
Guaranteeing synchronous gesture events from within that loop is next to
impossible, not to mention communication with some client which may
reside on another machine.

Also, why should gestures be in absolute sync to the events they
originate from? You're normally interested in either direct events or
derived ones, i.e. gestures, but not both. Even if both, you have
timestamps to sort things out.
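
For a client that does want both, sorting by timestamp could be as
simple as the Python sketch below. It assumes, purely for illustration,
that a gesture event carries the timestamp of the input event it was
derived from; the tuples and event names are made up.

```python
# Events in arrival order as (timestamp_ms, name). The gesture events
# arrive late because the recognizer runs asynchronously.
arrived = [
    (1000, "touch-down"),
    (1020, "touch-move"),
    (1040, "touch-move"),
    (1020, "gesture-start"),
    (1040, "gesture-cont"),
]

# sorted() is stable, so events with equal timestamps keep their
# arrival order: the direct event stays ahead of the derived gesture.
timeline = sorted(arrived, key=lambda ev: ev[0])
names = [name for _ts, name in timeline]
```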

In short, dropping the in-sync requirement is what makes this feasible, IMO.

>> I never fully worked this out, so I can't offer a fancy paper, but it
>> seems sensible to me. And since anyone's free to do a library, when it
>> comes down to something, a special client infrastructure
>> might be a preferable choice.
> I'm very interested in putting a quick hack together to try this out.
> However, my knowledge about X internals is somewhat limited. Are things
> like custom events possible, and how would a (special) client go about
> sending them?
There's the Generic Event Extension:

I'd make one 'gesture' event, which multiplexes all sorts of gestures.
Or maybe three, one for start|cont|end gesture each. Whatever fits the

The special client would need to invoke an appropriate gesture dispatch
request on the server, maybe as part of a 'X Gesture Extension' (hey
that's XGE too :), which would then assemble and dispatch gesture events
(only). I don't really see alternatives to this because only the server
can properly dispatch events. But XTst should provide some examples to
steal from.

Obviously, the event needs to be designed along with the request, and
dispatch needs to be worked out. At that point, you should already have
yet another half-arsed X protocol extension spec.

The server should do some state tracking so you don't get gestures going
on without getting their start etc, but that's for when things need to
really work.
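
Such state tracking could be quite small. A Python sketch, with
GestureStateTracker and the phase strings "start", "cont" and "end" as
hypothetical names rather than anything from a real extension:

```python
class GestureStateTracker:
    """Accept a gesture event only if it fits the start -> cont* -> end order."""

    def __init__(self):
        self.active = set()   # ids of gestures that have started but not ended

    def accept(self, gesture_id, phase):
        if phase == "start":
            if gesture_id in self.active:
                return False          # already running, duplicate start
            self.active.add(gesture_id)
            return True
        if phase in ("cont", "end"):
            if gesture_id not in self.active:
                return False          # never started, so drop it
            if phase == "end":
                self.active.discard(gesture_id)
            return True
        return False                  # unknown phase


tracker = GestureStateTracker()
results = [
    tracker.accept(7, "cont"),    # dropped: no start seen yet
    tracker.accept(7, "start"),
    tracker.accept(7, "cont"),
    tracker.accept(7, "end"),
    tracker.accept(7, "end"),     # dropped: gesture already ended
]
```

The server would dispatch only the events for which accept() returns
True.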

In reality, of course, the client should 'be special', i.e. you need
"register/unregister gesture client" requests, but for a quick stunt
that's optional as well; no-one else will be sending the gesture
dispatch request, so there's no contention to prevent. I think there's
even an Xi(2) request for getting all the input events to a client; if
that's good enough for this case, you don't need to do much for the
special client.

But it's definitely more work than a library! Still, it may be more
rewarding. And these days, there's XCB which reduces the pain of writing
extensions. But maybe Peter's OK with just extending Xinput.

It's very rough so far. A real impl would probably need to have some
opportunity for client interaction too, e.g. an app canceling or
grabbing gestures, which I guess you have worked out in your paper.

I hope this gives a better picture of the idea. And of course, I'd be
delighted to see it realized.



More information about the xorg-devel mailing list