Wayland client library thread safety

Fri Apr 27 14:09:55 PDT 2012

On Fri, Apr 20, 2012 at 10:06 PM, Kristian Hoegsberg
<hoegsberg at gmail.com> wrote:
> On Thu, Apr 19, 2012 at 03:38:39PM +0200, Arnaud Vrac wrote:
>> Hello everyone,
>>
>> I am hitting a bug when using Qt5 and wayland on an embedded platform,
>> for which I have written a custom EGL backend. The problem is that Qt5
>> (QtQuick2 actually) renders in a separate thread, when the wayland
>> display has been created in the main thread. I know there is a patch
>> for "thread affinity" in qtwayland to solve this, but it is wrong and
>> breaks other clients.
>>
>> Since it is required from the EGL spec to be able to render in
>> multiple threads, how can this be solved ? Does the wayland backend
>> for mesa support this properly ? I don't see how it can work since
>> even writes in the connection queue are not protected.
>
> Hi Arnaud,
>
> It's a problem that's been around for a while and we're aware of it.
> I've been working with Jørgen Lind from Nokias Qt team on it and we're
> looking at two different approaches for solving this.
>
> Initial assumption for wl_display and related objects was that it is
> not going to be thread safe, and you have to lock access to the wl API
> yourself if you're going to use it from multiple threads.  It's a
> valid assumption and it avoids fine grained API level locking, and in
> many cases, wl would be integrated into an existing mainloop that
> solves locking.  Of course, EGL makes that kind of design impossible.
>
> So the two options we're looking at are
>
> 1) Toughen up and add locking around all access to wl_display.  It's
>   not as much code as it sounds, there are actually only a few entry
>   points that need locking.  Another problem that we need to solve is
>   how the wl_display_iterate() inside EGL may dispatch events, from a
>   rendering thread and (in case of the mesa implementation) with the
>   EGL lock held.  The solution there is to add thread affinity to
>   objects such that event callbacks from an object are always handled
>   in the thread that owns the object.
>
>   In case of single-threaded EGL applications we also need to make
>   sure we don't call out to random application callbacks while
>   blocking inside eglSwapBuffer (or other blocking EGL calls).  The
>   idea here is to introduce a wl_display_iterate_for_object() entry
>   point that will only call out for events addressed to that object
>   and queue up the rest.  This makes eventloop integration more
>   complicated, since now the question of "does this wl_display have
>   events?" isn't just a matter or whether or not the fd polls
>   readable, there may also be events queued up in the wl_display.
>   That's a part of Xlib API I've tried hard to avoid inflicting on
>   wayland users, but thanks to Xlib working this way, it's something
>   most event loops support.  We could also do some trickery with
>   epoll and eventfd to the "events pending" iff "fd readable"
>   semantics again.
>
>   Jørgen has been working on this approach, his patch is attached
>   (still work in progress).
>
> 2) Add a new wl_display for the other thread.  This gives EGL its own
>   wl_display, complete with a separate socket to the server.  This is
>   a more radical idea perhaps, but much less invasive.  We don't have
>   to change the assumption that wl_display is only usable in one
>   thread and we don't have to add locking.  We do need a mechanism to
>   share an object between the threads (typically the wl_surface), but
>   that's doable.  The bigger problem is how to synchronize the two
>   command streams.  Right now we can say eglSwapBuffer and then know
>   that that attaches a new wl_buffer to the surface and do something
>   that relies on that having happend, but if it happens in different
>   connections we need a mechanism to synchronize the two protocol
>   streams.  Of course, if you want to do something like that, you
>   still need to synchronize between the two threads to make sure the
>   eglSwapBuffer happened in one thread before you do something that
>   depends on that in anoteher thread.
>
>   The biggest drawback with this approach is that even single
>   threaded EGL applications now need to connections to the server.
>
> Anyway, that sums up the work we've done in this area.  I'd say that
> option 1) is the preferred solution at the moment.

I have tried to make the library thread safe without changing the API.
The result is the attached patch, which differs from Jorgen's patch in
the following ways:

 * no need for an added wl_display_get_notification_fd function.
 * TLS key is stored in the display structure, instead of using a global.
 * a thread is created on connect, it is dedicated to handle read and
writes to the display socket. The input events are queued in the
thread in which their proxy was created.
 * wl_display_flush is now unnecessary, since sending is done
asynchronously in the IO thread.
 * wl_display_iterate_for_objects is unnecessary (if you do not agree,
can you tell me what it would bring ?).
 * Added missing locks around the objects map accesses.

This is a proof of concept, but the patch should work out of the box
with any current wayland client. I would just like to know if this
approach could be considered for merging in master. Some optimisations
could be done, for example we can allow any thread to dispatch output
events. We could also implement finer locking around the display
connection, but that would mean changing the connection
implementation. The wl_display_connect function also needs some added
code in case of failure.

You can also check the code here:
https://github.com/fbx/wayland/commits/threaded-io

Please tell me what you think of this approach.

Thanks,

-- 
rawoul
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wayland-threaded-io.patch
Type: application/octet-stream
Size: 17148 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/wayland-devel/attachments/20120427/3a0b0c51/attachment-0001.obj>