[Xcb] [PATCH 3/3] fix pthread_cond_wait and get_socket_back dependency

Christian König deathsimple at vodafone.de
Mon Jun 10 03:31:50 PDT 2013


Hi guys,

first of all sorry for not answering earlier, I've been on vacation for 
the last week.

Am 07.06.2013 21:06, schrieb Josh Triplett:
> On Fri, Jun 07, 2013 at 08:07:53PM +0200, Uli Schlachter wrote:
>> On 07.06.2013 18:27, Josh Triplett wrote:
>>> On Fri, Jun 07, 2013 at 05:11:06PM +0200, Uli Schlachter wrote:
>>>> Some thread A is currently writing (_xcb_conn_wait(), called via
>>>> _xcb_out_send()), thus c->out.writing != 0. Now thread B calls
>>>> xcb_send_request() and ends up waiting on c->out.cond until thread A is
>>>> finished. Thread C now calls xcb_take_socket() and ends up waiting for thread A,
>>>> too (*).
>>>>
>>>> (*): This is complicated: xcb_take_socket() -> _xcb_out_flush_to() ->
>>>> xcb_out_send() -> _xcb_conn_wait() -> pthread_cond_wait() on c->out.cond.
>>>>
>>>> Once thread A is done writing, _xcb_out_send() broadcasts on c->out.cond, so
>>>> threads B and C will both wake up and race for the iolock.
>>>>
>>>> If thread C wins this race, the socket is taken. Next, thread B will call
>>>> send_request() without calling get_socket_back() first. This is wrong, since
>>>> thread C could be calling xcb_writev() at the same time, too, and sequence
>>>> numbers can get confused.
>>> I don't see how this can happen the way you've described.  With thread A
>>> writing, threads B and C are both waiting to acquire the iolock.  If
>>> thread C wins, it will take the socket before it unlocks the iolock.
>>> After C drop the iolock, B will take it, and then immediately after
>>> taking the iolock B will call get_socket_back.
>> Well, no. Thread A is in _xcb_conn_wait()'s poll() (or select()) call. The last
>> thing it did before calling poll()/select() was dropping the iolock. Thus,
>> thread B can take the socket, call get_socket_back() (nothing to do here), then
>> notices that there is a writer and waits for the writer. It won't check
>> get_socket_back() again, which is what the patch changes.
> I see; missed that detail in the scenario you outlined.

Usually I don't look at specific scenarios while working on multi 
threading issues at all, the chance that you miss something is just to high.

On the other hand in this case Ulis scenario seems to be absolutely 
right and the argument that it's a prerequisite for those functions to 
both have the socket and no other writer of course still stands.

>
>>> That said, I do find myself staring at the later parts of
>>> xcb_send_request, specifically the calls to send_sync can drop the
>>> iolock; xcb_take_socket takes the iolock, but does not check
>>> c->out.writing.  Jamey?
>> I don't think so.
>>
>> xcb_take_socket() does check c->out.writing:
>> Before dropping the iolock, send_request() will have done ++c->out.request.
>> Thus, in xcb_take_socket(), the condition "c->out.request !=
>> c->out.request_written" ends up being true and this will call
>> _xcb_out_flush_to() in a loop (thanks to Keith fixing another bug here :-) ).
>> _xcb_out_flush_to() will then end up waiting for the pre-existing writer.
>>
>> So, I think there is no problem inside xcb_take_socket(), it will wait for the
>> writer. Also, the writer should(?) not have any problems, because no other
>> thread can continue before it finishes and when it finished it will do the next
>> send_request() call again before dropping the iolock.
> That *sounds* right...
>
>>>> This is my understanding of the problem. I guess it wouldn't hurt if the commit
>>>> message mentioned what the patch is about in more detail...
>>> Agreed.
>>>
>>>> P.S.: Meh. xcb_take_socket() first calls get_socket_back(), then
>>>> _xcb_out_flush_to(). The latter function can drop the iolock and thus another
>>>> thread could have successfully called xcb_take_socket() before this call. These
>>>> races with xcb_take_socket() seem complicated enough that someone should prove
>>>> that they are no problem, or at least that suitable assert()s should be added...
>>> Yeah, as mentioned above, we need to stare at the lock-dropping cases
>>> more closely, both those in the xcb_take_socket path and those in the
>>> xcb_send_request path, to make sure they exclude each other.
>> Perhaps it would be best to have some wrapper functions for acquiring/releasing
>> the iolock (or rather: acquiring the socket?). During acquire, the wrapper would
>> also make sure that writers get out of the way and that the socket is not taken.
>> The release wrapper would just be there for consistency.
>>
>> Hopefully these could be used everywhere and thus we have to worry about less
>> special cases. This would only leave doubts about code which continues after it
>> wrote to the socket (and thus possibly dropped the iolock)...
> That sounds like a good idea.  Likewise we'd need a wrapper for waiting
> on a condition variable, since that drops the lock.

Yes, that indeed sounds like a good idea. Something like prepare_for_write?

Christian.


More information about the Xcb mailing list