[Xcb] [PATCH 3/3] fix pthread_cond_wait and get_socket_back dependency

Uli Schlachter psychon at znc.in
Fri Jun 7 08:11:06 PDT 2013


Hi,

On 05.06.2013 23:38, Jamey Sharp wrote:
> On Thu, May 16, 2013 at 11:35:07AM +0200, Christian König wrote:
>> Am 16.05.2013 02:02, schrieb Jamey Sharp:
>>> Consider, for example: What happens if xcb_take_socket is called from
>>> two threads at once?
>>
>> Not much, changing socket ownership is protected by the iolock. So
>> when two threads call into xcb_take_socket the thread who can
>> acquire the iolock wins and gets the socket. The other thread gets
>> the iolock after the first returns from xcb_take_socket and so calls
>> return_socket to get the socket back to xcb before giving it out
>> again.
>>
>> But that's exactly the argument why the current implementation
>> doesn't work correctly, any caller of xcb_take_socket must make sure
>> that the socket isn't stolen by another thread. The conventional way
>> to do so is to use a locking primitive, which is acquired by the
>> caller of xcb_take_socket and inside return_socket.
> 
> Multiple libraries are intended to be able to share XCB's socket. For
> instance, I might build an application using Xlib, cairo, and a
> higher-level language binding to XCB that does its own request
> marshalling.
> 
> Xlib uses its Display lock to prevent other threads from calling
> xcb_take_socket from inside Xlib, but that doesn't stop another thread
> from calling xcb_take_socket from cairo, or from the language binding.
> 
> I believe those cases are OK, though, because if (for instance) Xlib
> gets the socket, then when cairo tries to take it, XCB will call Xlib's
> return_socket, and that will block on Xlib's Display lock until Xlib is
> done using the socket.
> 
> However, getting back to this patch: If there's a bug of the sort you're
> trying to fix here, then if Xlib's running xcb_take_socket on one
> thread, and cairo is running it on another, I expect that bug will
> apply.
> 
> Therefore, I think either there is not a bug here, or this patch isn't
> complete--and since I can't prove to myself that the patch is correct, I
> can't tell which.

I have difficulties trying to understand this.

As far as I can tell, this patch is supposed to solve this problem:

Some thread A is currently writing (_xcb_conn_wait(), called via
_xcb_out_send()), thus c->out.writing != 0. Now thread B calls
xcb_send_request() and ends up waiting on c->out.cond until thread A is
finished. Thread C now calls xcb_take_socket() and ends up waiting for thread A,
too (*).

(*): This is complicated: xcb_take_socket() -> _xcb_out_flush_to() ->
xcb_out_send() -> _xcb_conn_wait() -> pthread_cond_wait() on c->out.cond.

Once thread A is done writing, _xcb_out_send() broadcasts on c->out.cond, so
threads B and C will both wake up and race for the iolock.

If thread C wins this race, the socket is taken. Next, thread B will call
send_request() without calling get_socket_back() first. This is wrong, since
thread C could be calling xcb_writev() at the same time, too, and sequence
numbers can get confused.

The patch solves this problem by calling get_socket_back() after waiting for a
writer to get out of the way. Since this could drop the iolock, the whole dance
with waiting for writers has to be done again.

This is my understanding of the problem. I guess it wouldn't hurt if the commit
message mentioned what the patch is about in more detail...

So this doesn't really have to do with two concurrent callers of
xcb_take_socket(), I think. Concurrent callers will synchronize on the iolock
and the latter call will get the socket back from the first one. No problem?

Please tell me if something I wrote is wrong or incomprehensible, but this is
what I think this patch is supposed to do.

Uli

P.S.: Meh. xcb_take_socket() first calls get_socket_back(), then
_xcb_out_flush_to(). The latter function can drop the iolock and thus another
thread could have successfully called xcb_take_socket() before this call. These
races with xcb_take_socket() seem complicated enough that someone should prove
that they are no problem, or at least that suitable assert()s should be added...
-- 
Sent from my Game Boy.


More information about the Xcb mailing list