[Xcb] xcb recursive write not protected in all cases?

Keith Packard keithp at keithp.com
Mon Dec 23 17:00:46 PST 2013

I'm reading a backtrace provided by an Intel business partner and found
the following:

   #0  0xf73b81c5 in raise () from /lib/i386-linux-gnu/libc.so.6
   #1  0xf73bb663 in abort () from /lib/i386-linux-gnu/libc.so.6
   #2  0xf73b10c7 in ?? () from /lib/i386-linux-gnu/libc.so.6
   #3  0xf73b1177 in __assert_fail () from /lib/i386-linux-gnu/libc.so.6
   #4  0xf779992d in write_vec (count=0xb8ab49f8, vector=0xb8ab49f4, 
	at ../../src/xcb_conn.c:180
   #5  _xcb_conn_wait (c=c at entry=0xc309ce0, cond=cond at entry=0xc30ad74,
	vector=vector at entry=0xb8ab49f4, count=count at entry=0xb8ab49f8)
	at ../../src/xcb_conn.c:443
   #6  0xf7799d38 in _xcb_out_send (c=c at entry=0xc309ce0,
	vector=vector at entry=0xb8ab4a18, count=count at entry=1)
	at ../../src/xcb_out.c:345
   #7  0xf779a650 in _xcb_out_flush_to (c=0xc309ce0, request=18557)
	at ../../src/xcb_out.c:372
   #8  0xf779a758 in xcb_take_socket (c=0xc309ce0,
	return_socket=0xf7693120 <return_socket>, closure=0xc339090, flags=0,
	sent=0xb8ab4ab8) at ../../src/xcb_out.c:271
   #9  0xf76929e6 in require_socket (dpy=0xc339090) at ../../src/xcb_io.c:67
   #10 0xf769348f in require_socket (dpy=0xc339090) at ../../src/xcb_io.c:59
   #11 _XFlush (dpy=0xc339090) at ../../src/xcb_io.c:510
   #12 0xf7696258 in _XGetRequest (dpy=0xc339090, type=43 '+', len=4)
	at ../../src/XlibInt.c:1973
   #13 0xf768eeeb in XSync (dpy=0xc339090, discard=discard at entry=0)
	at ../../src/Sync.c:43
   #14 0xf6d6fe82 in dri2XcbSwapBuffers (remainder=0, divisor=0, target_msc=0,
	dpy=<optimized out>, pdraw=<optimized out>)
	at ../../../../src/glx/dri2_glx.c:803
   #15 dri2SwapBuffers (pdraw=0x27ca6c20, target_msc=0, divisor=0, 
	flush=1) at ../../../../src/glx/dri2_glx.c:841
   #16 0xf6d43c9e in glXSwapBuffers (dpy=0xc339090, drawable=16777242)
	at ../../../../src/glx/glxcmds.c:836

The assertion failure is from:

        /* precondition: there must be something for us to write. */
        static int write_vec(xcb_connection_t *c, struct iovec **vector, int *count)
            int n;

At first blush, this assertion is ridiculous -- write_vec doesn't deal
with the out.queue at all; it's dealing with an array of
iovecs. However, what it's *really* checking is that no other thread has
jumped in an written data over the top of the out.queue, which is
generally one of the iovec members. Writing out.queue to the socket
looks like:

    vector[0].iov_base = c->out.queue;
    vector[0].iov_len = c->out.queue_len;
    c->out.queue_len = 0;
    _xcb_out_send(c, vector, count);

So, if some other thread gets between the assignment setting queue_len
to 0 and the eventual write_vec call, it will end up smashing data over
the top of out.queue. We can tell that happened because queue_len will
be non-zero again.

Ok, so for this assertion to fail, I just need to find a place in the
code between _xcb_out_send and write_vec which might drop iolock and let
some other thread sneak in and smash the buffer.

_queue_len = 0
                        assert (queue_len == 0)

Well, that wasn't hard -- _xcb_conn_wait drops the lock to wait for the
connection to be ready to receive data. Of course, it sets writing at
the same time, and so we need to make sure everyone using out.queue
blocks waiting for writing to go to zero before it touches the buffer.

Fortunately, there's only one place that adds data to out.queue and
bumps out.queue_len, and that's in send_request. send_request itself
does not check out.writing, and so we need to look in the callers of that:

                        queue_len += iov_len;

send_sync also doesn't check for out.writing, so we need to look at the
callers of that:

Both _xcb_out_send_sync and xcb_send_request *do* check out.writing, so
we can stop looking up the stack.

_xcb_out_send_sync looks like this:

        void _xcb_out_send_sync(xcb_connection_t *c)
            /* wait for other writing threads to get out of my way. */
                pthread_cond_wait(&c->out.cond, &c->iolock);

As you can see, it makes sure that out.writing is zero at the top of the
function. This will keep it from continuing while some other thread is
down in write_vec writing out.queue to the network. However, the very
next thing that it does is get_socket_back(c). And, get_socket_back can
drop iolock a couple of times, which can let some other thread sneak in
and write a request to out.queue.

Similarly, xcb_send_request also loops waiting for out.writing to go to
zero, but it too calls get_socket_back. It can also call send_sync for a
couple of reasons.

I think we need to check out.writing closer to the actual use of the
out.queue; perhaps it would suffice to check that down inside
send_request itself? That would certainly offer the surest guarantee
that iolock wasn't dropped between the check for out.writing and the use
of out.queue.

keith.packard at intel.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 827 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/xcb/attachments/20131223/3b253b2e/attachment.pgp>

More information about the Xcb mailing list