[PATCH] core: implement a safe wl_signal_emit

Thu Feb 22 16:03:03 UTC 2018

On 2018/Feb/22 02:58, Daniel Stone wrote:
> Hi,
> 
> On 22 February 2018 at 14:14, Markus Ongyerth <wl at ongy.net> wrote:
> >> It seems that this patch makes that assumption invalid, and we would
> >> need patches to weston, enlightenment, and mutter to prevent a
> >> use-after-free during the signal emit?  Now I'm seeing valgrind errors
> >> on E and weston during buffer destroy.
> >>
> >> Personally, I don't think we should change this assumption and declare
> >> the existing code that's worked for years suddenly buggy. :/
> >
> > The code was buggy the whole time. Just because it was never triggered, does
> > not imply it's not a bug.
> > free()ing these struct wl_list without removing them from the signal list
> > leaves other struct wl_list that are outside the control of the current code
> > in an invalid, prone to use-after-free, state.
> 
> There's a difference between something being 'buggy' and a design with
> non-obvious details you might not like. If destroy handlers not
> removing their list elements were buggy, we would be seeing bugs from
> that. But instead it's part of the API contract: when a destroy signal
> is invoked, you are guaranteed that this will be the first and only
> access to your list member.

Could you kindly point me to where this is stated?
And to where the two kinds of wl_signal_emit semantics for wl_listener are
properly distinguished?

I don't want to say it can't be made part of the API; I'm saying it currently
is not. And at the current point it's a bug. Whether the decision is to fix
these bugs by changing the API or by changing the code is another issue.
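To make the stale-link concern concrete, here is a minimal sketch. It does not use libwayland: `struct link` and the `link_*` helpers are hypothetical stand-ins that only mimic how a struct wl_list is threaded through listeners, and the "teardown" is simulated rather than an actual free().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct wl_list; not the libwayland type. */
struct link {
	struct link *prev;
	struct link *next;
};

static void link_init(struct link *head)
{
	head->prev = head;
	head->next = head;
}

/* Insert elm directly after pos. */
static void link_insert(struct link *pos, struct link *elm)
{
	elm->prev = pos;
	elm->next = pos->next;
	pos->next = elm;
	elm->next->prev = elm;
}

/* Unlink elm by rewriting both of its neighbours. */
static void link_remove(struct link *elm)
{
	elm->prev->next = elm->next;
	elm->next->prev = elm->prev;
	elm->prev = NULL;
	elm->next = NULL;
}

static size_t link_length(const struct link *head)
{
	size_t n = 0;
	for (const struct link *l = head->next; l != head; l = l->next)
		n++;
	return n;
}

/* Hypothetical object that embeds a destroy-listener link. */
struct owner {
	struct link destroy_link;
};

/*
 * Returns how many entries the signal list still references after
 * owner b is torn down, with or without unlinking it first.
 */
size_t entries_after_teardown(int unlink_first)
{
	struct link signal;
	struct owner a, b;

	link_init(&signal);
	link_insert(signal.prev, &a.destroy_link);
	link_insert(signal.prev, &b.destroy_link);
	assert(link_length(&signal) == 2);

	if (unlink_first)
		link_remove(&b.destroy_link);
	/* Imagine b's storage being released here. */

	return link_length(&signal);
}
```

Without the unlink, the list (and a's link) still points at b's storage after b is gone, which is the state I am calling invalid.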

(duplicated)
> when a destroy signal
> is invoked, you are guaranteed that this will be the first and only
> access to your list member.

This is wrong, btw. It would prevent us from adding a listener to a signal
that already has listeners, or from using wl_signal_get on the destroy signal
as soon as it has at least one listener.

The semantics you want to state are that it's the last time it is accessed.
And even that gets iffy if someone wants to wl_signal_get a destroy listener
at any point.
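A minimal sketch of why "first and only access" cannot hold literally. Again this is not libwayland code: `struct link`, `link_insert` and `signal_add` are hypothetical stand-ins, with `signal_add` appending at the tail the way I understand wl_signal_add to do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct wl_list; not the libwayland type. */
struct link {
	struct link *prev;
	struct link *next;
};

static void link_init(struct link *head)
{
	head->prev = head;
	head->next = head;
}

/* Insert elm directly after pos. */
static void link_insert(struct link *pos, struct link *elm)
{
	elm->prev = pos;
	elm->next = pos->next;
	pos->next = elm;
	elm->next->prev = elm;
}

/* Append a listener link at the tail of the signal's list. */
static void signal_add(struct link *head, struct link *elm)
{
	link_insert(head->prev, elm);
}

/*
 * Returns true if adding a second listener rewrote the link that
 * belongs to the first listener.
 */
bool second_add_touches_first(void)
{
	struct link head, first, second;

	link_init(&head);
	signal_add(&head, &first);
	/* With a single listener, both of its pointers refer to the head. */
	assert(first.next == &head && first.prev == &head);

	signal_add(&head, &second);
	/* The first listener's link was written to by an unrelated add. */
	return first.next == &second;
}
```

So every later add next to a listener already accesses that listener's link, long before any destroy signal is emitted.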

> This implies that anyone trying to remove
> their link from the list (accessing other listeners in the list) is
> buggy.

This would be rather interesting.
It would be a fair point to add this to the emit callback of destroy 
listeners, but do we have any way to determine if a wl_signal_emit is 
currently in flight for a signal?
What about the case where we have a struct that contains multiple destroy
listeners to different components? How can we determine that the destroy event
isn't a transitive event of another listener we have?
We could have a listener on a client's destroy signal to clean ourselves up,
and one on a seat's.
If such a seat is created by a virtual keyboard (I'm going out on a limb here,
but whatever), it would naturally be destroyed with the client.

Now if we are in the seat destroy signal, how do we determine what to do with 
our client listener?
If this is caused by a client destruction, we can't remove ourselves from the
client anymore.
If this is not caused by a client destruction, we have to remove ourselves.
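For comparison, a rough sketch of what the handler looks like when removal is always allowed. The names here (`struct link`, `struct binding`, `binding_release`) are hypothetical and the list is a stand-in, not libwayland; `link_remove` re-initialises the link, which is a choice of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct wl_list; not the libwayland type. */
struct link {
	struct link *prev;
	struct link *next;
};

static void link_init(struct link *l)
{
	l->prev = l;
	l->next = l;
}

/* Insert elm directly after pos. */
static void link_insert(struct link *pos, struct link *elm)
{
	elm->prev = pos;
	elm->next = pos->next;
	pos->next = elm;
	elm->next->prev = elm;
}

/* Unlink elm and leave it self-linked, so a repeated removal is harmless. */
static void link_remove(struct link *elm)
{
	elm->prev->next = elm->next;
	elm->next->prev = elm->prev;
	link_init(elm);
}

static bool link_empty(const struct link *head)
{
	return head->next == head;
}

/* Hypothetical object listening on both a client and a seat destroy signal. */
struct binding {
	struct link client_destroy;
	struct link seat_destroy;
};

/* One cleanup path, no matter which of the two signals triggered it. */
static void binding_release(struct binding *b)
{
	link_remove(&b->client_destroy);
	link_remove(&b->seat_destroy);
}

bool release_leaves_both_signals_clean(void)
{
	struct link client_signal, seat_signal;
	struct binding b;

	link_init(&client_signal);
	link_init(&seat_signal);
	link_insert(client_signal.prev, &b.client_destroy);
	link_insert(seat_signal.prev, &b.seat_destroy);
	assert(!link_empty(&client_signal) && !link_empty(&seat_signal));

	binding_release(&b);
	return link_empty(&client_signal) && link_empty(&seat_signal);
}
```

Under the "never remove during destroy" reading, `binding_release` would instead need to know which signal is in flight, and nothing in the list itself carries that information.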

I can also easily point you towards use cases where this would lead to
pointless code duplication, since we can remove our destroy listeners on very
similar events, with identical behaviour.
The only difference would be whether we are allowed to (and in that case
forced to) wl_list_remove our listener, or have to do nothing with it.

> 
> > Suddenly allowing this is a breaking API change (*some* struct wl_list inside
> > a wl_listener) can suddenly become invalid for reasons outside the users
> > control.
> 
> I don't know if I've quite parsed this right, but as above, not
> removing elements of a destroy listener list, when the listener is
> invoked, is our current API.
> 
> > Related to this entire thing:
> > In [1] you added tests for this and promote something, that is in essence, a
> > breaking change.
> 
> It's not a breaking change though: it's the API we've pushed on everyone so far.

It is a breaking change in libwayland.
Or could you kindly point me to what currently forces me to not remove my 
listeners in libwayland?
Current libwayland allows both versions as an implementation detail, but one
follows the wl_listener semantics, while the other makes assumptions.

> 
> > It also makes good wrapper implementations into managed languages annoying.
> > For example (admittedly my own) [2] ensures a wl_listener can never be lost
> > and leak memory. It is freed when the Handle is GC'd.
> > To prevent any use-after-free into this wl_listener, it removes the listener
> > from the list beforehand.
> > I would very much like to keep this code (since it is perfectly valid on the
> > current ABI) and is good design in managed languages.
> 
> Sure, that is annoying. In hindsight, it probably wasn't a good API
> for particularly the new generation of managed languages. In the
> meantime, probably the easiest way to do this, and come into line with
> all the other users, would be to define a separate destroy-listener
> type which intentionally leaks its wl_listener link after being
> signaled, rather than removing it.

Could you kindly point me to the point where this became an API over an 
implementation detail?
I can't find any point in either the spec or the library documentation that
guarantees any of the assumptions made for this to be valid.

Cheers,
ongy