[PATH] core: implement a safe wl_signal_emit

Thu Feb 22 18:45:00 UTC 2018

On 2018/2月/22 04:53, Daniel Stone wrote:
> Hi ongy,
> 
> On 22 February 2018 at 16:03, Markus Ongyerth <wl at ongy.net> wrote:
> > On 2018/2月/22 02:58, Daniel Stone wrote:
> >> On 22 February 2018 at 14:14, Markus Ongyerth <wl at ongy.net> wrote:
> >> > The code was buggy the whole time. Just because it was never triggered, does
> >> > not imply it's not a bug.
> >> > free()ing these struct wl_list without removing them from the signal list
> >> > leaves other struct wl_list that are outside the control of the current code
> >> > in an invalid, prone to use-after-free, state.
> >>
> >> There's a difference between something being 'buggy' and a design with
> >> non-obvious details you might not like. If destroy handlers not
> >> removing their list elements were buggy, we would be seeing bugs from
> >> that. But instead it's part of the API contract: when a destroy signal
> >> is invoked, you are guaranteed that this will be the first and only
> >> access to your list member.
> >
> > Could you kindly point me towards the point that states this?
> > Also the point that properly distinguishes two kinds of wl_signal_emit
> > semantics for wl_listener?
> >
> > I don't want to say it can't be made part of the API, I'm saying it is not.
> > And at the current point it's a bug. If the decision is to fix these bugs, by
> > changing the API or the code, is another issue.
> 
> I can't point you to a document where it is written in stone, no. But
> it would break Weston, WebKit, Clutter, and some others. Changing that
> would be a break of our effective API: we've exposed behaviour which
> people have come to rely on, regardless of whether or not it's
> documented.

I don't mind putting up bug reports with all of them for their broken code.
Also I don't really conside Weston an outside consumer, it's developed over 
the same channels as libwayland. If you want the fixes to weston attached in 
the patch series, I'd be glad to whip them up.

Qt relies on this not being the case (they remove listeners from the list in 
destructors), we rely on it not being the case. If I found a python or java 
wrapper, I would expect them to use detructors as well. It's not just a 
feature of nieche languages like Haskell.
IF we want to guarantee any kind of interop between different libraries, we 
need one (and only one) way to do this. Remove ourselves from the list, or 
free and leave the list broken.

I do agree, as long as we keep to ourselves, we could have the policy to just 
do it as we did so far. But what if we have a library that can be used 
standalone? Then we should keep to the official specification on what to do.
Thus all code has to be adjusted. There is no room for private breaking of 
this.

How do you determine which projects to piss off here?
One side has implemented measures to keep code correct and prevent accidently 
leaking memory or produce use-after-free by forcing a removal before memory is 
freed.
The other side broke basic C sanity by freeing memory that's still pointed 
into.

(Side note: I'm of course only slightly biased here, and this is not written 
in any suggestive way at all)

> 
> Equally, you could say that wl_display_get_fd() is 'buggy' because it
> should follow the established convention of duplicating FDs returned
> by a library, so users can be in control of lifetime. You could 'fix'
> that by having wl_display_get_fd() duplicate the FDs it returns, and
> when every Wayland client using an event loop complains that we're now
> causing them to leak FDs, tell them it's their own fault for relying
> on behaviour which was not formally documented as API/ABI. But that's
> really not a helpful answer, and it's a very good way to get people to
> stop using your software.
The doc of that one actually says "the", not "a" or "a copy of", so I'd 
actually say you are breaking things that you shouldn't :)

I have to admit, I'm making asumptions as well. Based on basic sanity (IMO).
It's convention that explicitly state when something returned by a call has to 
befreed by the caller, so not stating that implicitly states things.

I of course also asume memory I have pointers into doesn't get free()'d at 
arbitrary times without any way for me to detect it.
If this is an asumption that should not be may around libwayland I can 
guarantee, you will be free from my complaints.

And see the section above as to why I think we have to have a certain amount 
of breakage in either direction.
We can't just say "do whatever you want" if we expect code sharing instead of 
monolithic applications.

> > We could have a listener on a client destroy signal to clean ourself up, 
> > and
> > one on a seats.
> > If such a seat is created by a virtual keyboard (I'm going on a limb here, but
> > whatever), it would naturally be destroyed with the client.
> > Now if we are in the seat destroy signal, how do we determine what to do with
> > our client listener?
> > If this is caused by a client destruction, we can't remove ourselves form the
> > client anymore.
> > If this is not caused by a client destruction, we have to remove ourselves.
> 
> That's arguably somewhat contrived: you're not concerned about the
> lifetime of the client per se, but just the lifetime of the object. So
> you can just keep your destroy listener on the object, and know that
> if the client gets destroyed then the object will as well. The same
> goes for related objects, e.g. wl_surface -> xdg_surface. But it is a
> good point, and definitely shows reason to be really careful about
> what we do here.

I'm not going to deny that it's a bit contrived, I made it up no the spot.
We do have the client resources asociated to binding seats though. And once we 
have clients that are allowed to emulate input, this might actually be a 
possible situation.

E.g. the org_kde_kwin_idle protocol be interested in client and seat destroy
events. And has no way to associate to each other, should one be capable of 
transitivly destroying the other.

> 
> >> > Related to this entire thing:
> >> > In [1] you added tests for this and promote something, that is in essence, a
> >> > breaking change.
> >>
> >> It's not a breaking change though: it's the API we've pushed on everyone so far.
> >
> > It is a breaking change in libwayland.
> > Or could you kindly point me to what currently forces me to not remove my
> > listeners in libwayland?
> > Current libwayland allows both versions as implementation detail, but one
> > follows the wl_listener semantics, the other makes asumptions.
> 
> That's true, though in practice you have to have the entire stack of
> listeners for that resource completely locked down so they do the same
> thing. Given the prevalence of implementations which _don't_, we
> effectively chose one to encode which precludes the other.

That's an asumption I'd honestly not feel comfortable with.
Especially as someone who is involved in language bindings as well, I'd have 
to pick one, or reduce the safety I can provide in the binding level to 
support this. Neither of those sound particularily enticing to me.
And what should libwayland do, if it ever has a listener on a destroy event 
emitted by itself? It has to do either or. (Well, not quite. There's a hack.)

Can you help me with the last sentence? There's probably a small language 
barrier here, I can't quite parse that in context.
Or you just don't make sense to me :). Since the prevelance of implementations 
that _don't_ points towards locking down the stack for me, which I don't think 
is what you want to refere to here.

> 
> >> > It also makes good wrapper implementations into managed languages annoying.
> >> > For example (admittedly my own) [2] ensures a wl_listener can never be lost
> >> > and leak memory. It is freed when the Handle is GC'd.
> >> > To prevent any use-after-free into this wl_listener, it removes the listener
> >> > from the list beforehand.
> >> > I would very much like to keep this code (since it is perfectly valid on the
> >> > current ABI) and is good design in managed languages.
> >>
> >> Sure, that is annoying. In hindsight, it probably wasn't a good API
> >> for particularly the new generation of managed languages. In the
> >> meantime, probably the easiest way to do this, and come into line with
> >> all the other users, would be to define a separate destroy-listener
> >> type which intentionally leaks its wl_listener link after being
> >> signaled, rather than removing it.
> >
> > Could you kindly point me to the point where this became an API over an
> > implementation detail?
> > I can't find any point in either the spec or the library documentation that
> > guarantees any of the asumptions made for this to be valid.
> 
> If we change it, multiple external consumers of this API entrypoint
> crash. For me, by default that enshrines something as API rather than
> an implementation detail.

Multiple external consumers are liable to crash if you allow the behaviour of 
these clients, which makes it inherently a breaking change to the API :)

I think we have a basic disagreement on the effect it would have to make this 
an officialy supported way to use the API over an implementation detail.

Partialy based by your asumption, that things are monolithic. Which I don't 
want to subscribe to.

It is true, that this won't induce any crashes in those consumers ATM (as it 
doesn't change code, just docs), but if such a consumer intends to write 
correct software that adheres to the API, it is a breaking change onto them.

I think the both of us see the breaking change in different ways to go foward 
from each other.
Which may well be, because I'm more interested in correctness than being 
"polite" to projects.

In the end I can't prevent anyone from doing this, but I would kindly ask to 
do this properly, which needs a series of amendments on the docs, and call it 
the breaking change it is (if you still don't believe me on that, start on the 
docs, I will point it out specifically).

Cheers,
ongy
P.S. I'm aware I'm opinionated, wouldn't be fun otherwise.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/wayland-devel/attachments/20180222/6b38edaa/attachment.sig>