[Xcb] [PATCH 3/3] fix pthread_cond_wait and get_socket_back dependency
Josh Triplett
josh at joshtriplett.org
Fri Jun 7 09:27:37 PDT 2013
On Fri, Jun 07, 2013 at 05:11:06PM +0200, Uli Schlachter wrote:
> Some thread A is currently writing (_xcb_conn_wait(), called via
> _xcb_out_send()), thus c->out.writing != 0. Now thread B calls
> xcb_send_request() and ends up waiting on c->out.cond until thread A is
> finished. Thread C now calls xcb_take_socket() and ends up waiting for thread A,
> too (*).
>
> (*): This is complicated: xcb_take_socket() -> _xcb_out_flush_to() ->
> xcb_out_send() -> _xcb_conn_wait() -> pthread_cond_wait() on c->out.cond.
>
> Once thread A is done writing, _xcb_out_send() broadcasts on c->out.cond, so
> threads B and C will both wake up and race for the iolock.
>
> If thread C wins this race, the socket is taken. Next, thread B will call
> send_request() without calling get_socket_back() first. This is wrong, since
> thread C could be calling xcb_writev() at the same time, too, and sequence
> numbers can get confused.
I don't see how this can happen the way you've described. With thread A
writing, threads B and C are both waiting to acquire the iolock. If
thread C wins, it will take the socket before it unlocks the iolock.
After C drop the iolock, B will take it, and then immediately after
taking the iolock B will call get_socket_back.
That said, I do find myself staring at the later parts of
xcb_send_request, specifically the calls to send_sync can drop the
iolock; xcb_take_socket takes the iolock, but does not check
c->out.writing. Jamey?
> The patch solves this problem by calling get_socket_back() after waiting for a
> writer to get out of the way. Since this could drop the iolock, the whole dance
> with waiting for writers has to be done again.
Only because of patch 2 of this series, which was not around when we
tried to work out the correctness of xcb_take_socket. :)
> This is my understanding of the problem. I guess it wouldn't hurt if the commit
> message mentioned what the patch is about in more detail...
Agreed.
> P.S.: Meh. xcb_take_socket() first calls get_socket_back(), then
> _xcb_out_flush_to(). The latter function can drop the iolock and thus another
> thread could have successfully called xcb_take_socket() before this call. These
> races with xcb_take_socket() seem complicated enough that someone should prove
> that they are no problem, or at least that suitable assert()s should be added...
Yeah, as mentioned above, we need to stare at the lock-dropping cases
more closely, both those in the xcb_take_socket path and those in the
xcb_send_request path, to make sure they exclude each other.
- Josh Triplett
More information about the Xcb
mailing list