Protocol backwards compatibility requirements?

Pekka Paalanen ppaalanen at gmail.com
Mon Apr 20 12:05:32 UTC 2020


On Thu, 16 Apr 2020 17:47:56 +1000
Christopher James Halse Rogers <chris at cooperteam.net> wrote:

> On Wed, Apr 15, 2020 at 14:27, Simon Ser <contact at emersion.fr> wrote:
> > Hi,
> > 
> > On Monday, April 13, 2020 1:59 AM, Peter Hutterer 
> > <peter.hutterer at who-t.net> wrote:  
> >>  Hi all,
> >> 
> >>  This is a request for comments on the exact requirements for
> >>  protocol backwards compatibility for clients binding to new
> >>  versions of an interface.
> >>  Reason for this are the high-resolution wheel scrolling patches:
> >>  https://gitlab.freedesktop.org/wayland/wayland/-/merge_requests/72
> >> 
> >>  Specifically, the question is: do we **change** protocol elements or
> >>  behaviour as the interface versions increase? A few random examples:  
> > 
> > What we can't do is:
> > 
> > - Change existing messages' signature
> > - Completely remove a message  

Indeed.

> 
> It should be relatively easy to modify wayland-scanner to support both 
> of these things, *if* we decide that it's a reasonable thing to do. 
> (You'd do something like add support for <request name="foo" 
> removed_in="5"/> and the like)  

How would that work, given the version is negotiated at runtime?

The message signature structs are now ABI as well, and we have no room
for alternate signatures, do we?

> >   
> >>  - event wl_foo.bar introduced in version N sends a wl_fixed in
> >>    surface coordinates. version N+1 changes this to a normalized
> >>    [-10000, +10000] range.  
> > 
> > Argument types can't be changed. This would be a breaking change for
> > the generated code, we can't do that.
> 
> But this isn't changing the argument type; it's changing the 
> interpretation of the argument.
> In both cases the type is wl_fixed; in the first you interpret this 
> wl_fixed as being in surface coordinates, in the second you interpret 
> it differently.
> 
> This doesn't require any changes to code generation; I don't think this 
> is (in principle) any more disruptive than changing “wl_foo.baz is 
> sent exactly once” to “wl_foo.baz is sent zero or more times”, 
> which you're happy with.

Something we rarely consider is what happens when you pass Wayland
protocol objects into a library without first negotiating the object
version with the library. For example, we pass wl_surface into the EGL
Wayland wrapper library. If wl_surface got a version bump that breaks
backwards compatibility, meaning that version N+1 changes something that
existed in version N, the library handling only version N would fall
apart.

I sincerely hope this is the only case of a library taking in a
ready-made Wayland object. Getting the version negotiation right needs
inconvenient additions to the library API, which I don't think many
would bother with or even realize are needed.

You can query the version of a wl_proxy, sure, but that does not help
you if it returns a number larger than what your code knows about.
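
To make that concrete, here is a minimal sketch in plain C. It is a
model, not libwayland code: LIB_MAX_SURFACE_VERSION and lib_can_handle()
are hypothetical names, and the version is passed in directly where a
real library would get it from wl_proxy_get_version().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: highest wl_surface version this library was written for. */
#define LIB_MAX_SURFACE_VERSION 4u

/*
 * Returns true if the library knows how to drive an object of the given
 * version. With a real proxy the number would come from
 * wl_proxy_get_version(); here it is simply an argument.
 */
static bool
lib_can_handle(uint32_t object_version)
{
	assert(object_version >= 1); /* protocol versions start at 1 */
	return object_version <= LIB_MAX_SURFACE_VERSION;
}
```

When this returns false, about all the library can do is refuse the
object. The version was fixed when the application created the object,
so the library cannot negotiate it down after the fact.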

Btw. this is also a problem in the opposite direction. Let's say you
use a toolkit and the toolkit allows you access to the Wayland protocol
objects. Then the toolkit gains support for new interface versions and
uses them, but your app code is not updated. If the protocol change is
backwards incompatible, your app code may break even if only behaviour
changes and not signatures.

> >>  - request wl_foo.bar introduced in version N takes an int. version
> >>    N+1 changes wl_foo.bar to take a wl_fixed and an enum.
> > 
> > Ditto.
> >   
> >>  - request wl_foo.bar introduced in version N guaranteed to generate
> >>    a single event wl_foo.baz. if the client binds to version N+1 that
> >>    event may be sent zero, one or multiple times.
> > 
> > This is fine.
> >   
> >>  I think these examples cover a wide-enough range of the possible
> >>  changes.
> >> 
> >>  My assumption was that we only ever add new requests/events but
> >>  never change existing behaviour. So wl_foo.bar introduced in version
> >>  N will always have the same behaviour for any interface N+m.
> > 
> > We can change existing requests' behaviour. This has already been
> > done a number of times, see e.g. wl_data_offer.accept or
> > xdg_output.description.
> > 
> > Clients should always have a max-version, ie. they should never
> > blindly bind to the compositor's version.
> > 
> > What is also fine is marking a message as "deprecated from version
> > N". Such a message wouldn't be sent anymore starting from this
> > version.
> >   
> >>  I've seen some pushback for above linked patchset because it gets
> >>  complicated and suggestions to just change the current interface.
> >>  The obvious advantage is being able to clean up any mess in the
> >>  protocol.
> >> 
> >>  The disadvantages are the breakage of backwards compatibility with
> >>  older versions. You're effectively forcing every compositor/client
> >>  to change the code based on the version number, even where it's not
> >>  actually needed. Or, IOW, a client may want a new feature in N+2 but
> >>  now needs to implement all changes from N+1 since they may change
> >>  the behaviour significantly.
> 
> This is the meat of the question - all of the changes described are 
> technically fairly simple to implement.

Breaking stuff is simple, sure. Or what do you mean?

> To some extent this is a question of self-limitations. As has been 
> mentioned, protocols have *already* been deliberately broken in this 
> way, and people are happy enough with that. As long as we're mindful of 
> the cost such changes impose, I think that having the technical 
> capability to make such changes is of benefit - for example, rather 
> than marking a message as “deprecated from version N” I think it 
> would be preferable to just not have the message in the listener 
> struct. (Note that I'm not volunteering to *implement* that capability, 
> and there are probably more valuable things to work on, but if it 
> magically appeared without any effort it'd be nice to have that 
> capability).

We cannot do this.

The simple reason is that the protocol object version is negotiated at
runtime. The code must always be generated for all versions from 1 up
to max version wanted. It is always possible that the program on the
other end of the Wayland connection implements only version 1.
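
As a rough illustration of why every version from 1 up to the maximum
stays reachable, here is a small sketch in plain C. The names
CLIENT_MAX_SEAT_VERSION and pick_bind_version() are hypothetical; the
result is the kind of number a client would pass as the version
argument of wl_registry_bind().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: highest wl_seat version this client implements. */
#define CLIENT_MAX_SEAT_VERSION 7u

/*
 * The version actually used is the smaller of what the compositor
 * advertises and what the client supports. The client controls only
 * its own maximum, so any value in [1, max] can come out at runtime.
 */
static uint32_t
pick_bind_version(uint32_t advertised)
{
	assert(advertised >= 1); /* protocol versions start at 1 */
	return advertised < CLIENT_MAX_SEAT_VERSION ?
	       advertised : CLIENT_MAX_SEAT_VERSION;
}
```

An old compositor advertising version 1 yields 1 here no matter how new
the client is, which is why the code paths for the old versions cannot
be dropped from the generated code or from the client.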

> The status quo is that we're happy (perhaps accidentally) with 
> requiring a client to implement all changes from N+1 in order to get 
> something from N+2. I think whether or not that's ok is a case-by-case 
> decision. How difficult is it for clients to implement N+1? How much 
> simpler does the break make protocol version N+1? If it's trivial for 
> clients to handle and makes the protocol significantly simpler, I think 
> it's obvious that we *should* make the break; likewise, if it's likely 
> to be difficult for clients to handle and doesn't make N+1 much 
> simpler, it's obvious that we *shouldn't*.

Likewise it is not possible to cherry-pick features from version N+2
without also implementing version N+1 fully, because at runtime the
negotiation may end up with version N+1.

> For the specific case at hand, it doesn't seem like it would be 
> particularly difficult for clients to handle axis events changing 
> meaning in version 8, and it looks like the protocol would be 
> substantially simpler without the interaction between axis_v120, axis, 
> and axis_discrete.

Since we are talking about wl_pointer specifically, let me remind us
about the interface hierarchy:

- wl_seat
  - wl_pointer
  - wl_touch
  - wl_keyboard

Wayland uses inheritance to determine protocol object versions when an
explicit version is not provided. The only relevant request here that
can create objects with an explicit version number is wl_registry.bind.
This can be used to set the wl_seat version only. Then wl_pointer,
wl_touch, wl_keyboard all get their version from the wl_seat object.

If you want to have wl_pointer version N+2 and wl_touch version N+1,
you have to create two different wl_seat objects for the same wl_seat
global, with different version numbers. Most clients do not do this,
though.
It's simplest to just have one wl_seat object negotiated with the
highest possible version.

So the requirement to implement all earlier versions is not even
limited to the interface itself, it applies to the whole interface tree
starting from the global.

If you decide that wl_pointer version < N+2 is unsupported, you do so
for wl_seat, wl_touch, and wl_keyboard as well.
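
The inheritance described above can be modelled in a few lines of plain
C. This is a toy model, not libwayland: struct object, bind_seat() and
get_child() are hypothetical stand-ins for a proxy, wl_registry.bind,
and requests like wl_seat.get_pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a protocol object; not a libwayland type. */
struct object {
	const char *interface;
	uint32_t version;
};

/* Models wl_registry.bind: the one place a version is given explicitly. */
static struct object
bind_seat(uint32_t version)
{
	struct object seat = { "wl_seat", version };

	assert(version >= 1); /* protocol versions start at 1 */
	return seat;
}

/*
 * Models wl_seat.get_pointer and friends: there is no version argument,
 * so the new object takes the version of the object that created it.
 */
static struct object
get_child(const struct object *parent, const char *interface)
{
	struct object child = { interface, parent->version };

	return child;
}
```

In this model, getting a pointer and a touch object with different
versions means calling bind_seat() twice with different numbers and
creating each child from its own seat.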

It is possible to simplify the messaging sequence for an interface tree
by saying that starting from version N, things behave differently. But,
then you need two implementations in both servers and clients: one
for < N and one for >= N.
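
A sketch of what "two implementations" looks like in a client, in plain
C. Everything here is assumed for illustration: the threshold of 8
echoes the version mentioned earlier in the thread, and the units per
wheel detent (10 before, 120 after) are made-up values, not taken from
any released protocol version.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical version N at which the axis value changes meaning. */
#define AXIS_NEW_MEANING_SINCE 8u

/* Assumed units per wheel detent, for illustration only. */
#define LEGACY_UNITS_PER_DETENT 10.0
#define NEW_UNITS_PER_DETENT 120.0

/*
 * Converts a raw axis value to wheel detents. The object version picks
 * the interpretation, so both branches have to stay in the code.
 */
static double
axis_to_detents(uint32_t pointer_version, int32_t value)
{
	assert(pointer_version >= 1); /* protocol versions start at 1 */

	if (pointer_version < AXIS_NEW_MEANING_SINCE)
		return value / LEGACY_UNITS_PER_DETENT; /* path for < N */

	return value / NEW_UNITS_PER_DETENT; /* path for >= N */
}
```

The compositor needs the mirror image of this branch on the sending
side, once per object, because different clients may have negotiated
different versions.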

If you're asking if the implementation for version < N could be
deleted or avoided, then I'd say no. Definitely no for desktop
compositors, probably no for anything else public.


Thanks,
pq