[Xcb] DRI2GetBuffersWithFormat hangs waiting for X.org

Joris Dobbelsteen joris.dobbelsteen at sioux.eu
Wed Dec 1 07:23:31 PST 2010


On Wed, 2010-12-01 at 12:58 +0200, Pauli Nieminen wrote:
> On Tue, Nov 30, 2010 at 6:22 PM, Joris Dobbelsteen
> <joris.dobbelsteen at sioux.eu> wrote:
> > Hi all,
> >
> > I'm looking for some expert opinion to verify a possible cause of our
> > problems.
> >
> > Further investigation of the issue found the following:
> > We do all our X drawing on a single thread.
> > However, DirectFB (used due to legacy reasons) creates a second thread
> > for reading inputs (events).
> >
> > Could this multi-threaded usage of libX11 cause it to hang?
> >
> > What we see is that our drawing thread gets stuck in a poll on the
> > socket inside _XReply (after _XReply has called _XSend), while a
> > second thread is seen reading from a socket for inputs.
> >
> 
> Does someone set the connection to thread safe mode?
> 
> If yes then xcb might be causing the problem.

The issue was found to be in xcb.

I've opened a thread with the subject "Xlib/XCB in multi-threaded
situation results in deadlock" to discuss the problem. We have a small
workaround in place that works for us.

- Joris

> > Thanks,
> >
> > - Joris Dobbelsteen
> >
> > On Fri, 2010-11-26 at 18:14 +0100, Joris Dobbelsteen wrote:
> >> Hi all,
> >>
> >> I have an application that hangs after a call to
> >> DRI2GetBuffersWithFormat in Mesa.
> >>
> >> I am looking for help with the issue and advice on how to diagnose it.
> >>
> >> The issue is that our application is hanging at some point waiting for
> >> the X server to respond.
> >> One workaround is to move the mouse, which seems to generate events and
> >> that makes the communication get back on track.
> >>
> >> The issue has been very consistent, occurring at most 5 minutes after
> >> the X server is started and usually sooner. The issue does not occur in
> >> Ubuntu 10.10, which means there is a difference somewhere.
> >>
> >> The stack trace for the X client we are consistently getting is:
> >> #0  0xb7741424 in __kernel_vsyscall ()
> >> #1  0xb6d269b4 in pthread_cond_wait () from /lib/libpthread.so.0
> >> #2  0xb6a03344 in ?? () from /usr/lib/libX11.so.6
> >> #3  0xb6a02d2f in ?? () from /usr/lib/libX11.so.6
> >> #4  0xb6a1e0d3 in _XReply () from /usr/lib/libX11.so.6
> >> #5  0xb5de69a3 in DRI2GetBuffersWithFormat (dpy=0x8397fa0,
> >> drawable=10485772, width=0x88dd5a4, height=0x88dd5a8,
> >> attachments=0xbfaf0098, count=1,
> >>      outCount=0xbfaf00c4) at dri2.c:454
> >> #6  0xb5de478f in dri2GetBuffersWithFormat (driDrawable=0x88dd580,
> >> width=0x88dd5a4, height=0x88dd5a8, attachments=0xbfaf0098, count=1,
> >> out_count=0xbfaf00c4,
> >>      loaderPrivate=0x88dd4e0) at dri2_glx.c:582
> >> #7  0xb5a7d3df in intel_update_renderbuffers (context=0x835f260,
> >> drawable=0x88dd580) at intel_context.c:290
> >> #8  0xb5a7d8cc in intel_prepare_render (intel=0x83997c0) at
> >> intel_context.c:438
> >> #9  0xb5a7a99d in i915_render_start (intel=0x83997c0) at i915_vtbl.c:58
> >> #10 0xb5a8ed9b in intelRenderStart (ctx=0x83997c0) at intel_tris.c:1087
> >> #11 0xb5b888b3 in run_render (ctx=0x83997c0, stage=0x8607d88) at
> >> tnl/t_vb_render.c:276
> >> #12 0xb5b7cb84 in _tnl_run_pipeline (ctx=0x83997c0) at tnl/t_pipeline.c:153
> >> #13 0xb5a8f07d in intelRunPipeline (ctx=0x83997c0) at intel_tris.c:1075
> >> #14 0xb5b7d284 in _tnl_draw_prims (ctx=0x83997c0, arrays=0x85f5e1c,
> >> prim=0x85f47f0, nr_prims=1, ib=0x0, min_index=0, max_index=3) at
> >> tnl/t_draw.c:478
> >> #15 0xb5b7dd86 in _tnl_vbo_draw_prims (ctx=0x83997c0, arrays=0x85f5e1c,
> >> prim=0x85f47f0, nr_prims=1, ib=0x0, index_bounds_valid=1 '\001',
> >> min_index=0,
> >>      max_index=3) at tnl/t_draw.c:384
> >> #16 0xb5b75667 in vbo_exec_vtx_flush (exec=0x85f46b8, unmap=1 '\001') at
> >> vbo/vbo_exec_draw.c:384
> >> #17 0xb5b72560 in vbo_exec_FlushVertices_internal (ctx=0x83997c0,
> >> unmap=0 '\0') at vbo/vbo_exec_api.c:876
> >> #18 0xb5b72638 in vbo_exec_FlushVertices (ctx=0x83997c0, flags=1) at
> >> vbo/vbo_exec_api.c:910
> >> #19 0xb5b4bb79 in _mesa_BindTexture (target=34037, texName=19) at
> >> main/texobj.c:1058
> >> #20 0xb5df80b1 in glBindTexture (target=34037, texture=19) at
> >> ../../../src/mapi/glapi/glapitemp.h:1627
> >> #21 0xb4a10e44 in glSetState () from
> >> /usr/lib/directfb-1.4-5-pure/gfxdrivers/libdirectfb_gl.so
> >> #22 0xb6dc3ee1 in ?? () from /usr/lib/libdirectfb-1.4.so.5
> >> #23 0xb6dc698c in dfb_gfxcard_blit () from /usr/lib/libdirectfb-1.4.so.5
> >> #24 0xb6d78679 in ?? () from /usr/lib/libdirectfb-1.4.so.5
> >> #25 0xb6e01825 in IDirectFBSurface::Blit () from /usr/lib/lib++dfb-1.0.so.0
> >>
> >> I see the X server getting a request and sending a reply. After this
> >> it is back in the waiting-for-command state:
> >>
> >> #0  0xb76fb424 in __kernel_vsyscall ()
> >> #1  0xb73c07cd in select () from /lib/libc.so.6
> >> #2  0x0809bb28 in WaitForSomething ()
> >> #3  0x0806e0ae in ?? ()
> >> #4  0x092b0da8 in ?? ()
> >> #5  0x00000002 in ?? ()
> >> #6  0x00000000 in ?? ()
> >>
> >> With DEBUG_COMMUNICATION in os/io.c defined, I see the following output:
> >> [snip]
> >> REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x161b
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x161c
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x161d
> >> REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
> >> REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x161e
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x161f
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x1620
> >> REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
> >> REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x1621
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x1622
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x1623
> >> REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
> >> REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x1624
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x1625
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x1626
> >> REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
> >> REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x1627
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x1628
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x1629
> >> REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
> >> REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x162a
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x162b
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x162c
> >>
> >> It keeps waiting there until I move the mouse, after which I see
> >> several more REPLY lines in a row.
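
A log like the one above can be checked mechanically to see whether the
last line the server logged for a client is a request or a reply. The
following is a minimal sketch only: the function name summarize and the
regular expression are illustrative assumptions modelled on the three
line shapes quoted above, and other X server builds may print a different
format. Note that the log records what the server handed to its output
code, not necessarily what reached the socket.

```python
import re

# Matches the REQUEST / REPLY line shapes seen in the quoted log.
# Assumption: the format is exactly as quoted; mail quote prefixes are ignored
# because the pattern is searched for anywhere in the line.
LINE_RE = re.compile(
    r"(REQUEST|REPLY): ClientIDX: (\d+),? "
    r"(?:(XEvent|Xreply): )?type: (0x[0-9a-fA-F]+)"
    r"(?:.*seq#: (0x[0-9a-fA-F]+))?"
)


def summarize(log_text):
    """Count requests, replies and events per client index."""
    stats = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # skips "[snip]", blank lines and anything unrecognised
        direction, client, kind, _opcode, seq = match.groups()
        entry = stats.setdefault(
            int(client),
            {"requests": 0, "replies": 0, "events": 0,
             "last_seq": None, "last_line": None},
        )
        if direction == "REQUEST":
            entry["requests"] += 1
            entry["last_line"] = "REQUEST"
        else:
            entry["events" if kind == "XEvent" else "replies"] += 1
            entry["last_line"] = kind
            if seq is not None:
                entry["last_seq"] = int(seq, 16)
    return stats


# The final six lines of the quoted log, including the mail quote prefix.
SAMPLE = """\
> >> REQUEST: ClientIDX: 6, type: 0x3e data: 0x7 len: 7
> >> REPLY: ClientIDX: 6 XEvent: type: 0xe detail: 0x0 seq#: 0x162a
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0xe1 len: 5 seq#: 0x162b
> >> REQUEST: ClientIDX: 6, type: 0x86 data: 0x7 len: 5
> >> REPLY: ClientIDX: 6 Xreply: type: 0x1 data: 0x63 len: 5 seq#: 0x162c
"""

result = summarize(SAMPLE)
for client, entry in sorted(result.items()):
    print(client, entry)
```

If last_line for the hung client is "Xreply" rather than "REQUEST", the
server at least believes it has answered the most recent request, which
narrows the search to output flushing or to the client side.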
> >>
> >> I have the faint impression that the X server does something wrong
> >> when buffering the message and does not flush it. What is the right
> >> approach to see when the message is actually written to the socket?
> >> There seems to be quite a lot of cleverness around writing to the
> >> socket, probably for performance reasons.
> >>
> >>
> >> Note: this is the same issue as:
> >> <http://lists.freedesktop.org/archives/intel-gfx/2010-November/008850.html>
> >>
> >> Used software (somewhat updated in the meantime):
> >> * Linux 2.6.37-rc3, but same issue with 2.6.36 (and I think with 2.6.35
> >> as well, but I'm not completely sure).
> >> * mesa 7.9.
> >> * xf86-video-intel 2.12.0, 2.13.0, 2.13.901.
> >> * libdrm 2.4.22 (and today's trunk, as it had a lot of intel patches).
> >>
> >> The issue looks very much the same as:
> >> <http://www.pubbs.net/201003/xorg/2227-problem-using-an-mesa-based-app-with-recent-xorgmesaxf86-video-intelloop.html>
> >>
> >> Thanks in advance,
> >>
> >> - Joris
> >>
> >> _______________________________________________
> >> xorg-devel at lists.x.org: X.Org development
> >> Archives: http://lists.x.org/archives/xorg-devel
> >> Info: http://lists.x.org/mailman/listinfo/xorg-devel
> >>
> 



