Finishing the network protocol

Andreas Hartmetz ahartmetz at gmail.com
Thu Mar 3 06:32:31 PST 2011


Hello again and sorry for taking a few days to reply.

On Wednesday 23 February 2011 16:40:27 Kristian Høgsberg wrote:
> On Tue, Feb 22, 2011 at 1:16 PM, Andreas Hartmetz <ahartmetz at gmail.com> wrote:
> > Hello,
> 
> Hi Andreas,
> 
> > I've started a Wayland implementation currently called "area", written in
> > C++ and with the main goal to work on all hardware that currently works
> > on Linux in some way (Framebuffer or X11).
> > I've started with the network code and noticed a few things that still
> > look like prototype code in Wayland, probably unchanged from early
> > versions.
> > Well, Wayland is becoming a serious project, so I think we should start
> > finishing and fixing the protocol. Additionally I don't know if I will
> > finish my own project so I'm looking to contribute my findings to the
> > main project.
> 
> Yup, there are still a number of unfinished items, even in the protocol.
> 
> > First some hopefully correct primer of the Wayland protocol for
> > interested onlookers.
> > - The Wayland protocol is a remote procedure call protocol of sorts. All
> >  messages are exchanged between objects; the protocol is asynchronous
> >  and no methods have return values as such.
> >  Methods can "return values" by triggering a message back.
> >  Everything is asynchronous, but order of messages is preserved.
> > - Each protocol-level object exists on both client and server
> 
> I think I would say that all objects live in the server and the
> clients have a proxy object for the objects they want to communicate
> with that lets them send requests and receive events.
> 
> > - It's easiest to think of all objects being created by the server on
> >  behalf of the clients.
> > - Object constructors don't return anything, they have an object ID
> >  argument that is pre-chosen and can later be used to refer to the
> >  created object. If creation succeeded there is no message back.
> 
> Yes, this is one of the things that I really like about the X protocol
> and that I wanted to preserve.  Creating objects is not a round trip,
> you pick the ID you want to use for the object and pass that to the
> server.  You can immediately send requests to that object, so you can,
> for example create a surface and buffer and attach the buffer in one
> write to the socket.
> 
> The asynchronous error mechanism is also an X-ism, and it works great
> except when something actually fails and the error is delivered later.
> We can of course do a synchronous mode of operation like what Xlib
> has.
> 
> > - The conversation between client and server is bootstrapped by creating
> >  the "display" object on each client, which then starts talking to the
> >  global display object on the server.
> > 
> > I see the following issues:
> > - There is no way to subscribe to events, or rather there is no way not
> >  to subscribe to all events.
> 
> True, but I don't think that in itself is a problem.  The compositor
> doesn't send a lot of events to begin with and almost none to clients that
> don't have keyboard or pointer focus.   So from the point of view that
> we don't want to spam the client with a lot of events, it's not a
> problem.  However, not having the subscribe mechanism means that we
> may send events from interfaces the client doesn't know about and if
> we had a subscribe mechanism, we would know that the client is
> listening and what version of the interface the client expects.  So I
> do think we need a subscribe mechanism.
> 
> > - Range-based ID issuance for object IDs (obviously can't use pointers
> >  between processes) is not bulletproof. It is possible for ranges to
> >  become fragmented insofar that they can't be reclaimed because there's
> >  one ID in every range. There is also currently no code that tries to
> >  reclaim ranges.
> 
> Yes, this is a problem with the current implementation.  For the
> current scheme to work well, the clients should reuse the IDs and use
> up a range completely before starting to allocate out of a new range.
> There's no problem with reusing IDs, but it may be a little harder to
> understand the output of WAYLAND_DEBUG=1 when the same ID keeps
> getting assigned to different objects.
> 
> >  The practical implication is that a Wayland server can, by design, not
> >  run indefinitely without exhausting ID (range)s.
> >  Another kind-of-problem is that a client can interfere with another
> >  client's operation by, intentionally or not, using IDs belonging to
> >  the other client.
> 
> Yup, this is a problem.  The compositor could check that the ID a
> client passes to create a new object falls into one of the ranges
> assigned to the client, but it doesn't do that now, and I'd rather not
> have to do that anyway.
> 
> > I've looked at the TODO and come up with a few ideas of my own for the
> > following suggestions to modify the protocol:
> > 
> > - Have one ID<->object map per client, except for global objects where
> >  there is a global map in the server.
> >  This is suggested in the TODO file; I've done it this way right away.
> >  Obviously each client will have its own map in the client process
> >  anyway.
> > - Have three ID ranges:
> >   a) for global objects
> >   b) for client-specific objects created by the server (do they exist?)
> >   c) for client-specific objects created by the client
> >  where "created by" really means "creation initiated (assigning an
> >  object ID) by".
> > - Handle subscription without an extra mechanism by creating or not
> >  creating the object that will receive the desired events. Might need
> >  some splitting of existing objects.
> 
> The plan I have here, and you probably saw that in the TODO, is to
> just let the client manage the entire 32 bit namespace and have a
> 'bind' request:
> 
>     <request name="bind">
>       <arg name="global_id" type="uint"/>
>       <arg name="version" type="uint"/>
>       <arg name="id" type="new_id" interface="object"/>
>     </request>
> 
> that tells the server that the client will reference this global
> object by the given id and that the client is using the given version
> number of the interface.  This also serves as a subscription request,
> that lets the server know that the client wants to receive events and
> the server responds by sending out the initial state events.
> 
> Objects created by the client (surface, buffer etc.) don't need the
> bind request, since the client provides the client id in the create
> request and obviously wants to receive events for that object.
> 
> >  This would IMHO be an elegant and minimal way to handle the matter.
> > - A scheme to recycle object IDs. When a new ID is needed, pick a free
> >  one at random. This introduces a problem:
> >  Suppose the client destroys object A with ID n, then by chance
> >  immediately reuses ID n for object B.
> >  The server will only receive this information later, the Wayland
> >  protocol being asynchronous and the server not having to respond to an
> >  object creation request, unless it goes wrong. In the meantime the
> >  server could send an event intended for A which would end up at B,
> >  causing Bad Things to happen - in my implementation most likely an
> >  assert failure unless the objects are of the same class.
> >  (This is the trickiest failure mode I could think of)
> >  The suggested solution is a kind of "rendezvous" for objects where this
> >  can happen, or for simplicity all objects:
> >  On both client and server, have a function that needs to be called
> >  twice to unregister an object ID.
> >  One call from the destructor of the local object when it destroys
> >  itself, one call from the remote counterpart object when it destroys
> >  itself. No matter in which order the method is called, the first call
> >  removes the ID<->object mapping and puts the object ID on a waiting
> >  list to avoid reuse. The second call removes the ID from the waiting
> >  list, making it free to reuse.
> 
> Ah, yes, very good point.  I was going to suggest that there could be
> an event that the server sends to acknowledge that it has bound an
> object to a client ID (whether through the bind request above or from
> the client creating a client object), but if we reuse the same ID too
> fast, then it's still unclear which object the ID refers to.  That
> could be fixed by adding a serial number, but at that point I think
> your idea is better and simpler.  In fact, with your idea, we can
> distinguish between events sent from an object that we've destroyed
> and server errors where the server sends an event from an object that
> never existed.
> 
I've started implementing this in my project and hit a problem: while the
destroy() methods are regular methods that get an opcode the regular way, the
confirmation message needs an opcode as well.
The idea I outlined above, which I may not have explained well, implies that a
destroy request looks exactly like a destroy confirmation. That way both sides
can handle destruction the same way: if the object was destroyed locally,
destroy the instance and send a destroy() call to the other side. The other
side sends back a destroy() when it is done destroying. Each side unregisters
the ID once destroy() has been both sent and received, keeping track with two
boolean flags per object. That way exactly one destroy() call is sent in each
direction.
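
To make the two-flag bookkeeping concrete, here is a minimal sketch in C++.
It is only an illustration under my own assumptions: the class
DestroyTracker and its methods add(), localDestroy(), remoteDestroy() and
isFree() are hypothetical names, not part of Wayland or of my current code,
and actually sending the message is left to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical bookkeeping for one side of a connection. The class and method
// names are illustrative only; they are not taken from Wayland or from area.
class DestroyTracker {
public:
    // Register a live object under the given ID. The ID must be free.
    void add(uint32_t id) {
        const bool inserted = m_states.emplace(id, State{}).second;
        assert(inserted);
        (void)inserted;
    }

    // The local object was destroyed. Returns true if the caller must now
    // send destroy() to the other side (false if it was already sent).
    bool localDestroy(uint32_t id) {
        State &s = m_states.at(id);
        if (s.sent)
            return false;
        s.sent = true;
        releaseIfDone(id, s);
        return true;
    }

    // destroy() arrived from the other side. Returns true if the peer
    // initiated destruction: the caller should destroy its local instance
    // and send destroy() back. Returns false if this was the confirmation.
    bool remoteDestroy(uint32_t id) {
        State &s = m_states.at(id);
        const bool mustReply = !s.sent;
        s.sent = true;      // the reply, if any, counts as our destroy()
        s.received = true;
        releaseIfDone(id, s);
        return mustReply;
    }

    // True once the ID is unknown or fully released, i.e. safe to reuse.
    bool isFree(uint32_t id) const { return m_states.count(id) == 0; }

private:
    struct State {
        bool sent = false;      // destroy() went out to the peer
        bool received = false;  // destroy() came in from the peer
    };

    // Drop the ID only after destroy() has travelled in both directions.
    void releaseIfDone(uint32_t id, const State &s) {
        if (s.sent && s.received)
            m_states.erase(id);
    }

    std::unordered_map<uint32_t, State> m_states;
};
```

With one tracker on each side, a client-initiated destruction goes like this:
the client's localDestroy() returns true and the ID stays blocked; the
server's remoteDestroy() returns true (reply needed) and frees the ID there;
the client's remoteDestroy() then returns false and frees the ID on the client.
If both sides happen to destroy the object at the same time, each one sees the
other's destroy() as the confirmation and no extra message is sent.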

In a nutshell, I think destroy() should implicitly be both a request and a
response whenever it appears in a client-created object. The methods could
actually be added implicitly to all client-created objects, which wouldn't be
more of a hack than "doubling" one occurrence of destroy() or adding a flag like
has_destructor="true", which btw would also be fine with me.

> > - Specify how parent-child relationships work, e.g. (bad example, the
> >  answer is probably no here) is a surface automatically destroyed when
> >  its screen goes away? By whom?
> > - Specify who gets to delete objects and how that looks in the protocol
> 
> I suppose we need an unbind to match bind.  Client created objects
> will always have a "destroy" request.
> 
> >  - this is probably more a matter of documentation; I didn't read all
> >  of Wayland's code carefully and implementers ideally shouldn't have to.
> > - Add information in the protocol description XML file about things like
> >  an object being global or not, and basically everything mentioned above
> >  that can benefit from help from the code generator.
> > 
> > I'm not publishing a repository URL right now because I haven't chosen a
> > license yet and because I've copied over the wayland.xml protocol
> > description file that bears no license header. Kristian, what about the
> > license of that file?
> 
> Oh, hmm, it's MIT licensed, but I'd rather it didn't get copied around
> too much since it's bound to get out of sync.  What if we make wayland
> install the protocol file in /usr/share/wayland or such?
> 
For the record - I've copied the file and added the same BSD license header as 
in the .c/.h files as an XML comment.

> > If there is interest I can polish my code a bit and publish a URL.
> 
> Yeah, that would be interesting.
> 
> Kristian


More information about the wayland-devel mailing list