[PATCH 1/2] protocol: Allow versioned message arguments

Alexander Larsson alexl at redhat.com
Mon May 13 04:07:21 PDT 2013


On mån, 2013-05-13 at 12:19 +0300, Pekka Paalanen wrote:
> On Mon, 13 May 2013 10:23:44 +0200
> Alexander Larsson <alexl at redhat.com> wrote:
> 
> > On ons, 2013-05-08 at 15:40 -0500, Jason Ekstrand wrote:
> > 
> > > 
> > > In short, I think this is far too complex for what it achieves.  In
> > > the case of scaling factor stuff, you can just do it with a second
> > > event.
> > 
> > I agree that what I posted have some open issues, and it was mostly
> > meant as a start of a discussion of how to handle this. But I completely
> > disagree with this. Adding a second event is totally a bad thing. I'll
> > try to explain my reasoning in this case, but also consider that over
> > time this will happen in other events too, so the complexity will
> > accumulate over time.
> > 
> > So, wl_output. This API is meant to be used to find out the initial
> > monitor geometry and later dynamic updates of it. One of the primary
> > consumers of this will be GdkScreen/GdkMonitor which is the Gtk+ api
> > that gives this information. We never want Gtk+ to report some partial
> > state to the application, so a multi-event geometry notification means
> > that Gtk+ would have to have some code like:
> > 
> >  // just got a geometry event 
> >  if (wl_output.version <= 2)
> >    update GdkMonitor based on event
> >    screeen.emit("changed");
> >  else
> >    // Can't update the data or notify until we have all parts
> >    screen.saved_geometry_event_data = event;
> >    screen.waiting_for_geometry_scaling_factor = 1;
> 
> This is the way to do it. As another way, rather than expecting a
> specific event and pending on that indefinitely, you could issue a
> roundtrip; when it returns, you are guaranteed to have received all the
> additional events related to this event. OTOH, roundtripping is not
> nice to use like this.

I don't think this is particularly nice, its just pushing complexity
unnecessarily on everyone else. It means over time there will be more
and more cruft accumulating in the wayland apis that applications need
to handle in complex ways like this. Its fine for a single one, but
imagine how it will look in 10 years, when every client needs to have a
spaghetti nest of version checks with different behaviour depending on
the server version. Its much nicer if the protocol implementation itself
can handle this (as far as possible).

> There is a slightly racy way to achieve the same, though: wait for
> idle. You wait for idle in any case before re-drawing the GUI, right?

I don't see how this would reliably work. If we get an event and handle
it, we're idle immediately. So it will constantly race with the second
event showing up in the pipe.

> In any case, a compositor can be assumed to send all related events in
> a burst, without interleaving other events. We can even specify the
> order in the protocol. We already do that for other things.

Relying on the burstiness is racy, so we need to define an exact order
and version behaviour, and every client and server need to open code
these exactly right or we'll get problems. This is certainly doable and
it makes things a lot easier for me to do this patch (although it makes
me cry as a gtk backend developer).

I guess in this particular case we can define that a new scale for a
wl_output must be sent before the matching geometry event. That way it's
easier for clients to handle. Just keep any scale events until you get
the next geometry event and then apply it with that data.

> > * The client depends on a complex behaviour of the server (send the 2nd
> >    event if wl_output.version > 2) which some compositors may not
> >    implement (due to a bug or mistake) which leads to client
> >    "livelocks".
> 
> This is the standard way to implement augmented protocol. If some
> compositors don't do it right, they are buggy and need to be fixed.
> Automatically and silently papering over the problem would make things
> only harder to debug.

Its certainly always what happens on *some* level. But it need not be
how it is visible to client applications. I guess its something of a
design decision of how high-level the wayland client library wants to
be. Does it match the bits in the wire protocol, or is it a slightly
higher level.

> > Or maybe I misunderstood you? Did you mean that the new event would be a
> > copy of the first event with the extra field added? So, there would be
> > geometry & geometry2. That would be a lot easier on the client side, all
> > you have to do is listen to both and fill in a default value for the new
> > fields in the handler for the old one. There are still some minor issues
> > in this apprach though:
> > 
> > * You need to send both messages over the wire. (Technically there is
> > some version negotiation when you bind the wl_output, but I don't think 
> > it seems right that incrementing the output version breaks old code that
> > used the old geometry event.)
> 
> It does not break. That's why the version used by both sides is
> negotiated. Servers need to honour the negotiated version number
> regardless.

I think its nice that negotiating version lets us later add new members
to the wl_output_listener struct so the server can then know to send
these events to your client, and we then don't call uninitialized
pointers for clients using older versions of the API.

But, I think its utterly broken if old APIs *break* when you increment
the negotiated version. Like, say you use the old geometry callback and
want an unrelated new event, so you bump the negotiated version to what
the new event needs. If this silently causes the old geometry callback
to not be called anymore this is broken.

And yeah, it seems to me that a client can cause an older server linked
to a new libwayland-server to crash by calling a new request that was
added to libwayland-server but is not yet handled by the server. The
code does verify that you don't send requests that it doesn't know about
via (opcode >= object->interface->method_count), but method_count is
compiled into the server library, not the server.

> Also, with your proposal, what happens if a server is built with an old
> libwayland-server, but at runtime links to a new libwayland-server?

I thought krh said the libwayland-server APIs were unstable. Certainly
if not we can't do anything like that.

> As far as I know, we already have the necessary mechanims in place to
> ensure libwayland API and protocol backwards compatibility, which
> allows any new/old combinations server vs. client, and also program vs.
> library at build vs. runtime. If you find the existing mechanisms
> somehow broken in that respect, we'd like to hear about it.

I don't think its "broken". All you need to extend *any* protocol
whatsoever is a single unused bit somewhere. "If this bit is set we're
doing something else". 

But, I also don't think that is the best API to handle extensibility of
a library. If it is *possible* to centrally manage the backwards
compatibility of the protocol rather than having every client do it that
makes for a better tested and more maintainable codebase.

> I don't think versioning event/request arguments, and plugging in
> arbitrary default values, is the right way.

I obviously disagree. And I don't understand what is arbitrary about
default values. They are defined in the actual protocol specification,
and you can't pick arbitrary values for them. The person who extends a
given event has to pick a default value that keeps the semantics of the
un-extended event. Which typically means "disable new feature", which in
the case of e.g. adding a new transform, means "scale by 1", "rotate by
0 degrees", "translate by 0,0", etc.




More information about the wayland-devel mailing list