Protocol backwards compatibility requirements?

Tue Apr 21 08:49:17 UTC 2020

On Tue, 21 Apr 2020 10:57:34 +1000
Christopher James Halse Rogers <chris at cooperteam.net> wrote:

> On Mon, Apr 20, 2020 at 15:05, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> > On Thu, 16 Apr 2020 17:47:56 +1000
> > Christopher James Halse Rogers <chris at cooperteam.net> wrote:
> >   
> >>  On Wed, Apr 15, 2020 at 14:27, Simon Ser <contact at emersion.fr> wrote:
> >>  > Hi,
> >>  >
> >>  > On Monday, April 13, 2020 1:59 AM, Peter Hutterer
> >>  > <peter.hutterer at who-t.net> wrote:  
> >>  >>  Hi all,
> >>  >>
> >>  >>  This is a request for comments on the exact requirements for protocol
> >>  >>  backwards compatibility for clients binding to new versions of an
> >>  >>  interface.
> >>  >>  The reason for this is the high-resolution wheel scrolling patches:
> >>  >>  https://gitlab.freedesktop.org/wayland/wayland/-/merge_requests/72
> >>  >>
> >>  >>  Specifically, the question is: do we **change** protocol elements or
> >>  >>  behaviour as the interface versions increase? A few random examples:
> >>  >
> >>  > What we can't do is:
> >>  >
> >>  > - Change existing messages' signature
> >>  > - Completely remove a message  
> > 
> > Indeed.
> >   
> >> 
> >>  It should be relatively easy to modify wayland-scanner to support both
> >>  of these things, *if* we decide that it's a reasonable thing to do.
> >>  (You'd do something like add support for <request name="foo"
> >>  removed_in="5"/> and the like)
> > 
> > How would that work, given the version is negotiated at runtime?
> > 
> > The message signature structs are now ABI as well, and we have no room
> > for alternate signatures, do we?  
> 
> Sure we do. Internally we can just give them different names, with 
> different contents, and switch based on the version requested at 
> runtime.
> 
>  From the client API side it's more difficult (at least for requests), 
> because we can't remove any symbols - we *can* make it a client error 
> with a good error message, though.
> 
> On the events side it's easier, as we can add a wl_foo_listener_v5 
> struct and wl_foo_add_listener_v5.
> 
> This does add a new sharp edge to the raw wl_proxy_* interface, but 
> client code isn't expected to be using that and this doesn't seem 
> particularly hard for language bindings to adapt to.

I guess I'd have to see all that implemented to see how it would work.
It's too far a stretch for my imagination atm.

What I can imagine is that it will lead to a lot of boilerplate code to
use different types with different object versions at runtime when you
cannot require a minimum negotiated version greater than 1.
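
A minimal sketch of the kind of boilerplate I mean. Everything here is
hypothetical: "foo", its "scroll" event gaining an argument in version 5,
the value120 argument, and the proxy stand-in are invented for
illustration and are not existing wayland-scanner output or libwayland
API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generated types: "scroll" gains an argument in v5. */
struct foo_listener_v1 {
    void (*scroll)(void *data, int32_t steps);
};
struct foo_listener_v5 {
    void (*scroll)(void *data, int32_t steps, int32_t value120);
};

/* Stand-in for a proxy object; not the real wl_proxy. */
struct foo_proxy {
    uint32_t version;      /* negotiated at bind time */
    const void *listener;  /* foo_listener_v1 or foo_listener_v5 */
    void *data;
};

struct app_state {
    int32_t total120;
};

static void on_scroll_v1(void *data, int32_t steps)
{
    struct app_state *app = data;
    /* Assumption: one legacy step corresponds to 120 units. */
    app->total120 += steps * 120;
}

static void on_scroll_v5(void *data, int32_t steps, int32_t value120)
{
    struct app_state *app = data;
    (void)steps;
    app->total120 += value120;
}

static const struct foo_listener_v1 listener_v1 = { on_scroll_v1 };
static const struct foo_listener_v5 listener_v5 = { on_scroll_v5 };

/* Client side: pick the listener layout that matches the version. */
static void foo_install_listener(struct foo_proxy *foo, struct app_state *app)
{
    foo->data = app;
    foo->listener = foo->version >= 5 ? (const void *)&listener_v5
                                      : (const void *)&listener_v1;
}

/* Dispatch side: the same switch on the version is needed again. */
static void foo_dispatch_scroll(const struct foo_proxy *foo,
                                int32_t steps, int32_t value120)
{
    assert(foo->listener != NULL);
    if (foo->version >= 5) {
        const struct foo_listener_v5 *l = foo->listener;
        l->scroll(foo->data, steps, value120);
    } else {
        const struct foo_listener_v1 *l = foo->listener;
        l->scroll(foo->data, steps);
    }
}
```

Both handlers and both listener structs have to stay in the client for
as long as a compositor may offer less than version 5.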

> > Something we rarely consider is if you pass Wayland protocol objects
> > into a library without negotiating the object version with the library
> > first. For example, we pass wl_surface into the EGL Wayland wrapper
> > library. If wl_surface would get a version bump breaking backwards
> > compatibility, meaning that version N+1 changes something that existed
> > in version N, the library handling only version N would fall apart.
> > 
> > I sincerely hope this is the only case of a library taking a ready-made
> > Wayland object in. Getting the version negotiation right needs
> > inconvenient additions to the library API that I don't think many would
> > bother with or even realize are needed.
> > 
> > You can query the version of a wl_proxy, sure, but that does not help
> > you if it returns a number larger than what your code knows about.
> > 
> > Btw. this is also a problem in the opposite direction. Let's say you
> > use a toolkit and the toolkit allows you access to the Wayland protocol
> > objects. Then the toolkit gains support for new interface versions and
> > uses them, but your app code is not updated. If the protocol change is
> > backwards incompatible, your app code may break even if only behaviour
> > changes and not signatures.  
> 
> This is an additional cost that should be considered for types that may 
> be transferred across library boundaries like this; we should also try 
> to make it clear to toolkits that this is a fraught API.

Sure. How? By whom?

> >>  This is the meat of the question - all of the changes described are
> >>  technically fairly simple to implement.  
> > 
> > Breaking stuff is simple, sure. Or what do you mean?  
> 
> Making breaking changes to version N+1 of a protocol in a way that 
> preserves the ability for version 1…N clients to continue to function 
> unchanged is technically fairly simple. The question is when we 
> *should*, and how much effort we should invest in capitalising on such 
> changes.

Do you make some assumptions about the problem I described when passing
protocol objects between independent components/libraries? Like that
the problem does not exist? Or that it's ok to disconnect if the object
version is not at least N+1?
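
To make the question concrete, here is a rough sketch of the check a
library would need once newer versions may be incompatible. The names
and the limit are hypothetical; in a real library the version would come
from wl_proxy_get_version() on the object that was handed in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: newest interface version this library was written for. */
#define LIB_MAX_KNOWN_VERSION 4u

enum lib_result {
    LIB_OK,
    LIB_VERSION_TOO_NEW
};

/* Decide whether the library can safely use an object it did not bind. */
static enum lib_result lib_check_object_version(uint32_t object_version)
{
    assert(object_version >= 1);
    if (object_version > LIB_MAX_KNOWN_VERSION)
        return LIB_VERSION_TOO_NEW; /* semantics may have changed */
    return LIB_OK;
}
```

With strictly backwards-compatible versions this check is unnecessary,
because a larger version still behaves like the one the library knows.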

> >>  The status quo is that we're happy (perhaps accidentally) with
> >>  requiring a client to implement all changes from N+1 in order to get
> >>  something from N+2. I think whether or not that's ok is a case-by-case
> >>  decision. How difficult is it for clients to implement N+1? How much
> >>  simpler does the break make protocol version N+1? If it's trivial for
> >>  clients to handle and makes the protocol significantly simpler, I think
> >>  it's obvious that we *should* make the break; likewise, if it's likely
> >>  to be difficult for clients to handle and doesn't make N+1 much
> >>  simpler, it's obvious that we *shouldn't*.
> > 
> > Likewise it is not possible to cherry-pick features from version N+2
> > without also implementing version N+1 fully, because at runtime the
> > negotiation may end up with version N+1.  
> 
> I don't think that's actually true, though? If we had a protocol where 
> a client could handle version N or version N+2 but *not* version N+1 
> and the compositor advertises version N+1 then the client could simply 
> bind version N and go about its business.

Ok, you modify the version check in the client. That's a novel idea.

> But I think we might be talking about different things here? This is 
> not about a compositor *not supporting* a version of a protocol.

I suppose that depends on whether version N+2 removes everything added
in version N+1. If so and the only versions you agree to bind to are <= N
or >= N+2, then you can ignore N+1.
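
Such a client-side version check could look roughly like this. It is
only a sketch: the skipped version (4, standing in for N+1) and the
client's maximum are made-up numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: the client handles versions 1..3 and 5..6, but not 4. */
#define CLIENT_SKIPPED_VERSION 4u
#define CLIENT_MAX_VERSION     6u

/* Pick the version to bind, given what the compositor advertises. */
static uint32_t choose_bind_version(uint32_t advertised)
{
    assert(advertised >= 1);
    uint32_t v = advertised < CLIENT_MAX_VERSION ? advertised
                                                 : CLIENT_MAX_VERSION;
    if (v == CLIENT_SKIPPED_VERSION)
        v = CLIENT_SKIPPED_VERSION - 1; /* fall back to N */
    return v;
}
```

The result would be what the client passes as the version argument to
wl_registry_bind().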

But this is only something a client could do. If a compositor
advertises version N+2, it can never skip implementing N+1 (unless
it disconnects clients that try to bind N+1).
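
The compositor side of that asymmetry, as a sketch with hypothetical
names: a client may bind any version from 1 up to the advertised one, so
every one of those versions needs an implementation behind it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical advertised version, standing in for N+2. */
#define ADVERTISED_VERSION 6u

/* Bit v of implemented_mask is set if version v semantics are implemented. */
static bool bind_is_served(uint32_t implemented_mask, uint32_t requested)
{
    assert(requested >= 1 && requested <= ADVERTISED_VERSION);
    return ((implemented_mask >> requested) & 1u) != 0;
}

/* Count the versions a client could bind that the compositor cannot serve. */
static uint32_t count_unserved_versions(uint32_t implemented_mask)
{
    uint32_t unserved = 0;
    for (uint32_t v = 1; v <= ADVERTISED_VERSION; v++) {
        if (!bind_is_served(implemented_mask, v))
            unserved++;
    }
    return unserved;
}
```

Any non-zero count means some clients get disconnected (or worse) when
they bind a version the compositor chose to skip.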

The asymmetry here is surprising to me. It also seems very complicated.
I believe it is much "simpler" to just have to implement everything up
to and including version N+1 to reach version N+2 - the path of least
surprise. Every time you want to skip a version, you still have to study
and understand that version in order to see that skipping it works. I
think that is a considerable burden on the developer, reviewers, and the
people debugging problems later.
What you win by omitting code with clever tricks I think you lose by
having to add documentation explaining why that is ok. Special cases
are still special cases and could be done differently, but I believe
the intent here was to set general guidelines.

Thanks,
pq