[Xcb] [PATCH] fix deadlock with xcb_take_socket/return_socket v2

Jamey Sharp jamey at minilop.net
Tue May 14 12:45:17 PDT 2013


Sorry I didn't reply sooner, and thanks for looking into this!

I'm not convinced you have the right diagnosis, though:

If a thread owns the XCB socket, it should not ask XCB to write on that
socket. The whole point of socket ownership is that XCB won't write
without taking the socket back. (xcb_take_socket's documentation says
this, but it should probably be clearer.)
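
To make the ownership rule concrete, here is a toy, single-threaded
model of the handshake. It is only a sketch: every toy_* name is
hypothetical, and it assumes nothing about XCB's real internals beyond
the rule stated above (the library reclaims the socket through the
owner's callback before it writes).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of socket ownership. NOT XCB's real data structures;
 * all toy_* names are made up for illustration. */
typedef void (*toy_return_fn)(void *closure);

struct toy_conn {
    toy_return_fn return_socket; /* non-NULL while an external owner holds the socket */
    void *closure;               /* passed back to the owner's callback */
    int lib_writes;              /* number of writes done by the library itself */
};

/* The library takes the socket back by asking the owner to return it. */
static void toy_reclaim_socket(struct toy_conn *c)
{
    if (c->return_socket != NULL) {
        toy_return_fn fn = c->return_socket;
        c->return_socket = NULL;
        fn(c->closure); /* owner flushes its own buffered output here */
    }
}

/* External code (libX11, in the real world) takes ownership and
 * registers the callback the library uses to get the socket back. */
static void toy_take_socket(struct toy_conn *c, toy_return_fn fn, void *closure)
{
    assert(fn != NULL);
    toy_reclaim_socket(c); /* any previous owner gives it up first */
    c->return_socket = fn;
    c->closure = closure;
}

/* Library-side write: never touches the socket while someone else owns it. */
static void toy_lib_write(struct toy_conn *c)
{
    toy_reclaim_socket(c);
    assert(c->return_socket == NULL);
    c->lib_writes++;
}

/* Example owner: just counts how often it was asked to give the socket back. */
struct toy_owner {
    int returned;
};

static void toy_owner_return(void *closure)
{
    ((struct toy_owner *)closure)->returned++;
}
```

In this model, an owner that asks the library to write gets its own
callback invoked first. If that callback needs a lock the caller
already holds, you have the kind of deadlock under discussion.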

libX11 carefully drops the Display lock around the xcb_generate_id calls
that could result in XCB trying to write, and as far as I can tell it
doesn't ask XCB to send requests any other time.

Therefore, barring further evidence, I believe this note from commit
933aee1d5c53b0cc7d608011a29188b594c8d70b is still the correct
explanation for https://bugs.freedesktop.org/show_bug.cgi?id=30450 .

- If one thread is waiting for events and another thread tries to read a
  reply, both will hang until an event arrives. Previously, if this
  happened it might work sometimes, but otherwise would trigger either
  an assertion failure or a permanent hang.

That's purely a libX11 bug, but may need new API in XCB to maintain all
the invariants. See the comment introduced in that commit for context:

/* Thread-safety rules:
 *
 * At most one thread can be reading from XCB's event queue at a time.
 * If you are not the current event-reading thread and you need to find
 * out if an event is available, you must wait.
 *
 * The same rule applies for reading replies.
 *
 * A single thread cannot be both the event-reading and the
 * reply-reading thread at the same time.
 *
 * We always look at both the current event and the first pending reply
 * to decide which to process next.
 *
 * We always process all responses in sequence-number order, which may
 * mean waiting for another thread (either the event_waiter or the
 * reply_waiter) to handle an earlier response before we can process or
 * return a later one. If so, we wait on the corresponding condition
 * variable for that thread to process the response and wake us up.
 */
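
The "look at both, then go in sequence-number order" rule can be
sketched as a small pure function. This is a simplified illustration,
not the libX11 code: the toy_* names are hypothetical, and breaking
ties in favor of the event is my assumption for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Which response should be handled next? (Hypothetical names.) */
enum toy_next { TOY_NONE, TOY_EVENT, TOY_REPLY };

struct toy_state {
    bool have_event;      /* an event is available */
    uint64_t event_seq;   /* sequence number carried by that event */
    bool have_pending;    /* some request is still awaiting its reply */
    uint64_t pending_seq; /* sequence number of the first pending reply */
};

/* Compare the current event with the first pending reply and pick
 * whichever comes first in sequence-number order.
 * ASSUMPTION: on equal sequence numbers the event goes first. */
static enum toy_next toy_pick_next(const struct toy_state *s)
{
    assert(s != NULL);
    if (!s->have_event && !s->have_pending)
        return TOY_NONE;
    if (!s->have_pending)
        return TOY_EVENT;
    if (!s->have_event)
        return TOY_REPLY;
    return (s->event_seq <= s->pending_seq) ? TOY_EVENT : TOY_REPLY;
}
```

If the answer names a response that the other waiter is responsible
for, the calling thread has to block on that waiter's condition
variable until it has been processed, as the comment says.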

And the actual bug is commented in that commit too:

/* If some thread is already waiting for events,
 * it will get the first one. That thread must
 * process that event before we can continue. */
/* FIXME: That event might be after this reply,
 * and might never even come--or there might be
 * multiple threads trying to get events. */

My intuition is that new XCB API allowing callers to retrieve the next
response, regardless of whether it's an event, error, or reply, might
solve this bug. But I haven't had time to work out the details since
writing that commit.
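
To give a rough idea of the shape I have in mind, here is a toy
sketch. No such API exists in XCB today; the toy_* names, the tagged
response struct, and the fixed-size queue are all hypothetical. The
only point is that a caller pulls responses in arrival order without
choosing in advance between "wait for an event" and "wait for a reply".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unified response type; not part of XCB. */
enum toy_kind { TOY_KIND_EVENT, TOY_KIND_ERROR, TOY_KIND_REPLY };

struct toy_response {
    enum toy_kind kind;
    uint64_t sequence;
};

#define TOY_QUEUE_CAP 8

/* Fixed-size FIFO holding responses in the order they arrived. */
struct toy_queue {
    struct toy_response items[TOY_QUEUE_CAP];
    size_t head;  /* index of the oldest response */
    size_t count; /* number of queued responses */
};

/* Record a response as it comes off the wire. Returns 0 if full. */
static int toy_push(struct toy_queue *q, enum toy_kind kind, uint64_t sequence)
{
    assert(q != NULL);
    if (q->count == TOY_QUEUE_CAP)
        return 0;
    size_t tail = (q->head + q->count) % TOY_QUEUE_CAP;
    q->items[tail].kind = kind;
    q->items[tail].sequence = sequence;
    q->count++;
    return 1;
}

/* Hand back the oldest response, whatever its kind.
 * Returns 0 if nothing is queued. */
static int toy_next_response(struct toy_queue *q, struct toy_response *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return 0;
    *out = q->items[q->head];
    q->head = (q->head + 1) % TOY_QUEUE_CAP;
    q->count--;
    return 1;
}
```

A real version would also have to block, wake other waiters, and
cooperate with the existing event and reply entry points, which is
exactly the part I haven't worked out.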

I'd love to see proposals for fixing this!

Jamey