[RFC wayland] Add wl_proxy destruction callbacks

Tue May 31 08:18:25 UTC 2016

On Mon, 30 May 2016 13:10:42 +0200
Miguel Angel Vico <mvicomoya at nvidia.com> wrote:

> Hi all,
> 
> A few days ago, I had a little chat over IRC with Pekka about addition
> of proxy objects destruction callbacks to the wayland client protocol.
> 
> 
> Summing up, we recently ran into some applications where native objects
> (wl_surface, wl_egl_window, wl_display) used by EGL are destroyed/made
> invalid before destroying the corresponding EGL objects. This sometimes
> causes crashes of the EGL driver, which is not nice. We have seen this
> with the NVIDIA EGL implementation, but I assume the Mesa EGL
> implementation is similarly exposed.
> 
> I agree this is in fact an application bug, but the EGL spec states that
> functions such as makeCurrent or swapBuffers should return error (not
> crash) if the native objects become invalid. I also agree the spec
> should have been clearer and probably allowed "undefined behavior", but
> it is not the case.
> 
> Having an objects destruction notification mechanism such as destruction
> callbacks would allow us to satisfy the spec.
> 
> Also, such functionality would also make life way easier under certain
> circumstances. I'm basically thinking about multi-threaded applications,
> where several threads make use of the same native objects, and for some
> reason one of the threads has to destroy one or more of them due to some
> sort of error happening.
> 
> Of course, this can still be considered an application bug, and the
> application could still make sure none of the threads is going to use
> the native objects before destroying them, but again, specs allow users
> to do many non-recommended things.
> 
> I think we should try to do our best to gracefully handle those
> non-desirable API usages, and avoid crashes whenever is possible.
> 
> 
> Pekka did not see this as something crazy to have, but wanted to hear
> from some of the toolkits guys before making the decision of whether
> changing the wl_proxy ABI is a good idea.

Hi,

indeed. I think we could do such an addition to the wl_proxy ABI
safely, if we want to. I just don't know if we want to.

I'm not so sure about the point of helping multithreaded applications.
Applications use toolkits, and toolkits' job is to support whatever
multi-threading ways they want, because they probably have to
explicitly support threading anyway. The issue is use cases where there
is no toolkit in between: EGL.

Another point that diminishes the multi-threaded use case is that one
cannot have multiple listeners on a wl_proxy anyway. If you need to
process events for the same object in more than one thread, you must
have the support code outside of libwayland-client. Also, the thread
assignment with listeners (wl_event_queue) must be done on the object
creation. It cannot be done afterwards without potential races. Again
EGL is somewhat special, since it does not need to listen on externally
created objects.

Then there is the question of what the locking should look like.

The EGL specification language is troublesome. I guess it was written
with X11 in mind, where anyone even outside of the current process can
go and destroy the native window at any time, without any coordination.
Obviously it is desireable that it does not lead to a crash. However,
that is not possible on Wayland. No external actor can go and destroy
the wl_display or wl_surface from behind your back, only the
application itself can do that. Therefore, I think if the application
deliberately shoots its own foot, a crash is ok, just like if you pass
a bad pointer through EGL API in the first place. I am very tempted to
say that this should be raised as a EGL spec bug, to allow undefined
behaviour or such. EGL platform specifications could then make their
own additional requirements, like on X11 you must not crash if Window
disappears.

Those were the cons, then the pros not yet mentioned:

Currently there is a problem with identifying Wayland protocol objects
referenced in incoming events, in case there are multiple components
creating them. See Section "Multiple input handlers" in
http://ppaalanen.blogspot.fi/2013/11/sub-surfaces-now.html
for an example of a problematic case. It is not the only one.

If wl_proxy had destroy listeners like wl_resource does, one could use
a wl_signal_get()/wl_resource_get_destroy_listener() kind of an
approach to fetch a component-specific struct, rather than assuming who
happens to own the singular "user data" pointer and risk a crash.

> 
> As an alternative, destruction callbacks could be hung off of
> wl_egl_window. In a similar way we support wl_egl_window_resize
> callbacks, we could support wl_egl_window_destroy callbacks.

At the moment I would slightly favour the wl_egl_window approach if it
allows to circumvent the EGL wording and the wording cannot be fixed.

> However, this isn't as foolproof as adding wl_proxy destruction
> callbacks, since destruction of wl_surface or wl_display objects before
> wl_egl_window would lead to same issues.
> 
> 
> I really think adding destruction callbacks to wl_proxy would be an
> improvement worth making, but others' thoughts must be heard first.

Thanks,
pq
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 811 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/wayland-devel/attachments/20160531/f7048047/attachment-0001.sig>