[Xcb] [PATCH] fix deadlock with xcb_take_socket/return_socket v2

Christian König deathsimple at vodafone.de
Thu May 16 02:08:20 PDT 2013


On 16.05.2013 00:05, Jamey Sharp wrote:
> [SNIP]
>
>> Yeah but nobody prevents other XCB users (like the OpenGL and/or
>> VDPAU implementation in mesa) to be called with the display-lock
>> held.
> Nothing prevents it except the fact that the callers won't work if they
> try it. If you want to call XCB-based API, make sure Xlib is not holding
> the internal Display lock on the current thread.

Well then please tell me how a library that links to XCB and NOT to 
libX11 should make sure of that?

My problem is that VDPAU (or OpenGL) should be completely independent, 
so application developers using VDPAU don't see any reason why they 
should drop the display lock around VDPAU calls. So it is indeed 
possible and actually quite likely that xcb functions get called with 
the display lock held.

So far I only see three possible solutions:
1. VDPAU uses libX11 instead of XCB (which kind of sucks badly).
2. We fix all applications all around the world to not call VDPAU with 
the X display lock held.
3. We fix XCB to allow calling XCB functions with the display lock held.

While solutions 1 and 2 are rather unlikely, solution 3 is actually 
pretty simple to implement.

> That's the rule for xcb_take_socket. If you want a different rule, we
> should have a different discussion. :-)

Well, the problem is that we force any caller of any XCB function to make
sure that the X display lock isn't held here, and so create a
dependency on libX11. If that's really true, then we should indeed
document that clearly, but I really think that this XCB behavior is not
intentional.

> Note that although there are lots of mentions of GL and VDPAU in #20708,
> none of the test programs call anything outside core libX11. I don't see
> how #20708 is about whatever bug you're seeing.

Because that's the original reason why Thomas created the bug report. 
Please see the discussion on the mesa list here: 
http://old.nabble.com/forum/ViewPost.jtp?post=22548486&framed=y

>
> [SNIP]
> I've just re-read the comments and all three test programs in #20708,
> and I don't see how any of them are locking-order deadlocks. You say
> this is "clear"; could you demonstrate how, please?

OK, sorry, maybe that's not so clear if you don't know the initial
discussion on the mesa list.

> The second test program demonstrated that we'd broken Xlib's crazy
> nested-lock implementation. Uli and I both wrote equivalent patches to
> fix that, and I closed the bug.
>
> Reinhard re-opened with a different bug, which most likely was being
> masked by the earlier bug. Although that bug's stack traces aren't
> clearly a result of #30450, it still seems like the most plausible
> candidate. Please help me understand why that's not true.

That's the reason why I didn't reopen the bug report, because I think
that those bug reports have seen more than enough confusion for now.

Anyway, the bug reports that are currently still open are definitely not
about the problem I'm trying to fix with that patch, but I think from the
mesa list discussion that this is the bug Thomas initially ran into.

>> Currently my testcase only triggers the deadlock fixed with this
>> patch (pretty complicated stuff involving XBMC, VDPAU and mesas
>> OpenGL implementation). But I'm pretty sure that once I get this one
>> upstream we are going to run into the other one also.
> Do you have a minimal test case you can share?

Uli came up with this example, and it indeed seems to describe the
problem consistently:

     - X11 has taken the socket.
     - Thread A has locked the display.
     - Thread B does xcb_no_operation() and thus ends up in libX11's
       return_socket(), waiting for the display lock.
     - Thread A calls e.g. xcb_no_operation(), too, ends up in
       return_socket() and, because socket_moving == 1, ends up
       waiting for thread B.

     => Deadlock



Christian.

>
> Thanks for taking the time to investigate this,
> Jamey
