I'm trying to remote individual X11 apps -- Please help!

Fri Jan 5 11:40:20 PST 2007

I am trying to figure out a reasonable way to allow individual X11
applications to be remoted to users on other machines. This would be
different from what VNC does: rather than remoting an entire desktop,
only the windows of a specific application (and its attendant popup
windows) would be remoted. By "remoted" I mean that the pixels in the
window show up in a proxy window on the remote user's machine and, in
addition, the remote user can take control of the application, so that
the events sent to the application come from the remote user (not from
the local user).

The reason I need to remote individual apps is that I have a use case
which requires that the local user still be able to run and
manipulate unremoted apps on the local desktop and, in addition, that
different users on different machines can take control of different
remoted apps on the local desktop. And, as if this weren't hard enough,
the remote user is not necessarily running X; the remote side could be
running any operating system.

The remoting of the bits is easy; VNC provides a good existence
proof. What is hard is the handling of input events. Something has to
turn raw mouse and keyboard events into high-level X11 protocol events
and send them to the application. In effect, all of the processing in
the X server's events.c must be done: grab handling, window picking,
upward propagation, etc.

The only solution I have thought of to date is to introduce the
concept of "multiple input contexts" into the X server. An input
context would contain all of the global event processing state
(inputInfo, sprite, syncEvents and a few others). The window manager
(or compositing manager, or whatever) can create new "secondary" input
contexts and "bind" them to specific X11 clients.  Raw mouse and
keyboard events would come from the remote proxy and would be injected
(via an extension request) into the client's secondary context.
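
To make the idea above a little more concrete, here is a rough model
of it in Python. This is not X server code. Every name in it
(InputContext, Server, create_and_bind, inject) is hypothetical, and
the fields only stand in for the real global state such as inputInfo
and sprite. It shows nothing more than the bookkeeping: contexts are
created, bound to clients, and fed raw events by the proxy.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class InputContext:
    """Hypothetical container for the per-context event state.

    The fields are placeholders for the X server globals named above;
    they are not the real structures.
    """
    name: str
    sprite_pos: tuple = (0, 0)    # stand-in for the sprite (pointer) state
    grab_client: object = None    # stand-in for active grab state
    raw_events: deque = field(default_factory=deque)  # injected, unprocessed


class Server:
    """Toy model: one primary context plus secondary contexts bound to clients."""

    def __init__(self):
        self.primary = InputContext("primary")
        self.secondary = {}       # client id -> InputContext

    def create_and_bind(self, client_id):
        # What the window manager would ask for via the (assumed) extension.
        ctx = InputContext(f"secondary-{client_id}")
        self.secondary[client_id] = ctx
        return ctx

    def context_for(self, client_id):
        # Clients with no secondary context keep using the primary one.
        return self.secondary.get(client_id, self.primary)

    def inject(self, client_id, raw_event):
        # Stand-in for the extension request sent on behalf of the remote
        # proxy. Raises KeyError if the client has no secondary context.
        self.secondary[client_id].raw_events.append(raw_event)
```

For example, after server = Server() and server.create_and_bind(7), a
call such as server.inject(7, ("motion", 10, 20)) queues a raw event in
client 7's secondary context and leaves the primary context untouched.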

During each event processing time slice the X server would first
process available events in the primary context and then process
available events in all secondary contexts, in turn. While a specific
context is being processed that context is made "current" and all of
the global event processing state comes from that particular context.

Also, whenever the X server is processing requests, the input context
of the requesting client is made current. This ensures that any
event-related state changes made by the client update the proper
context.
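
The scheduling and the "current context" switch described above could
be modeled roughly as follows. Again this is only a Python sketch with
made-up names (Server, run_time_slice, dispatch_request, and the
"GrabPointer" string as a stand-in for a real request); the actual
event processing is reduced to a placeholder.

```python
from collections import deque


class InputContext:
    """Hypothetical per-context event state (stand-in for inputInfo, sprite, ...)."""

    def __init__(self, name):
        self.name = name
        self.raw_events = deque()   # raw events waiting to be processed
        self.grab_client = None     # stand-in for grab state
        self.delivered = []         # events "sent" to clients, for illustration


class Server:
    def __init__(self):
        self.primary = InputContext("primary")
        self.bindings = {}            # client id -> secondary InputContext
        self.current = self.primary   # context all event code reads state from

    def bind(self, client_id, ctx):
        self.bindings[client_id] = ctx

    def process_raw_event(self, raw_event):
        # Placeholder for the real work (grabs, window picking, propagation).
        # It only touches state reached through self.current.
        self.current.delivered.append(raw_event)

    def run_time_slice(self):
        # Primary context first, then each secondary context in turn.
        for ctx in [self.primary, *self.bindings.values()]:
            self.current = ctx
            while ctx.raw_events:
                self.process_raw_event(ctx.raw_events.popleft())

    def dispatch_request(self, client_id, request):
        # Make the requesting client's context current before the request
        # is handled, so its state changes land in the right context.
        self.current = self.bindings.get(client_id, self.primary)
        if request == "GrabPointer":
            self.current.grab_client = client_id
```

With a secondary context bound to client 7, a grab requested by that
client is recorded in its own context only, and the local user's grab
state in the primary context is not disturbed.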

As I said, this is the best idea I have thought of so far that
satisfies the requirements of my use case. The *BIG PROBLEM* with it
is that it entails significant changes to the X server. But I haven't
been able to think of anything else that allows raw events from the
remote proxy to be processed into X11 protocol events with reasonable
fidelity.

Can anyone out there help me? Does anyone know of someone who has
tried implementing this sort of feature before? I am open to
suggestions.

        -Deron Johnson
        Project Looking Glass
        Sun Microsystems
        deron.johnson at sun.com
        (510) 996-7190