[Xcb] LBX + Performance

Jamey Sharp jamey at minilop.net
Sat Jun 4 19:09:06 PDT 2005


On Sat, 2005-06-04 at 17:30 -0700, John Davidorff Pell wrote:
> I was just reading An LBX Postmortem [1] and was curious: Can/Does  
> XCB implement any methods to improve performance over high-latency  
> links? For example, can/does it combine adjacent requests for  
> transport? I don't know much about LBX, but perhaps some of its  
> techniques should be incorporated directly into XCB.

If you're interested in this topic, you should also read Keith's more
recent paper, "X Window System Network Performance".
	http://keithp.com/~keithp/talks/usenix2003/
As Keith has been one of the people shaping XCB's design, performance
has been a consideration from the beginning.

I, too, don't know much about LBX. :-) I've looked at some related tools
that also tried to improve on Xremote and that I gather have inspired
the recent work of the NoMachine folks; I think "dxpc" was one such. XCB
doesn't do anything like what those tools do, as they focus on fancy
data compression algorithms. XCB may someday do those things, if they
actually improve performance, but in my opinion implementing them now
would be premature.

One of XCB's big features, though, is its latency-hiding mechanism.
Unlike Xlib, when an XCB-using application makes a request, it doesn't
automatically block until the reply arrives; instead it can make more
requests or do other processing until it actually needs that reply. I
once analyzed the start-up behavior of xterm and concluded that
rewriting it to use XCB could make its start-up dramatically faster. I
think the numbers were roughly 100 round trips with Xlib but about five
with XCB. Whatever the exact numbers, it was a big difference,
particularly if you consider the 200ms or more of round-trip delay on a
dial-up modem link: that's 20 seconds before the main window is even
created with Xlib, and only 1 second with XCB.
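
To make that arithmetic concrete, here is a minimal sketch in C. It is a
back-of-the-envelope model, not XCB code: the function names and the
idea of grouping requests into "stages" are assumptions for
illustration only. A client that blocks on every reply pays one round
trip per request; a client that sends a group of independent requests
before reading any reply pays roughly one round trip per group.

```c
#include <assert.h>

/* Hypothetical model, not part of XCB or Xlib. */

/* Total time spent waiting on the network, in milliseconds, for a
 * given number of round trips at a given round-trip time. */
long startup_wait_ms(long round_trips, long rtt_ms)
{
    assert(round_trips >= 0 && rtt_ms >= 0);
    return round_trips * rtt_ms;
}

/* Round trips needed when requests can be sent without waiting for
 * earlier replies. A "stage" is a group of requests whose inputs are
 * already known, so the whole group shares one round trip. This rounds
 * up, since a partly filled stage still costs a full round trip. */
long pipelined_round_trips(long requests, long requests_per_stage)
{
    assert(requests >= 0 && requests_per_stage > 0);
    return (requests + requests_per_stage - 1) / requests_per_stage;
}
```

With the figures above, startup_wait_ms(100, 200) gives 20000 ms and
startup_wait_ms(5, 200) gives 1000 ms. The model ignores bandwidth and
server processing time, so treat it as a lower bound on the wait.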

In fact, all Xlib-based applications have a minimum of five round-trip
delays on start-up (assuming XKB and BIG-REQUESTS are present) because
Xlib automatically fetches and caches certain values when it connects,
even though most applications don't use them. Compare XCB, which makes
no requests unless it has to.

There is a small bandwidth optimization that XCB used to support, which
was to combine adjacent requests if they had the same parameters -- same
request type (e.g. PolySegment), drawable, graphics context, etc. XCB's
implementation has evolved toward increased modularity, which has made
that particular optimization difficult to support, but I intend to find
a way to make it work as it could make a big difference for the RENDER
and GLX extensions.
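
As a rough illustration of that kind of merging, here is a sketch in C.
It is not XCB's former implementation: the struct and function names
are hypothetical, and it ignores the protocol's maximum request length,
which a real version would have to respect. The byte counts come from
the core protocol encoding of PolySegment: 12 bytes of fixed fields
followed by 8 bytes per segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PolySegment wire sizes: a 12-byte fixed part (opcode, unused byte,
 * 16-bit length, drawable, gc), then 8 bytes per segment (x1, y1, x2,
 * y2 as 16-bit values). */
enum { POLY_SEGMENT_HEADER_BYTES = 12, SEGMENT_BYTES = 8 };

/* Hypothetical description of one queued PolySegment request. */
struct queued_poly_segment {
    uint32_t drawable;
    uint32_t gc;
    size_t   segment_count;
};

/* Fold each request into its predecessor when drawable and gc match.
 * Compacts the array in place and returns the new request count.
 * Only adjacent requests are merged, so drawing order is preserved. */
size_t coalesce_adjacent(struct queued_poly_segment *reqs, size_t count)
{
    assert(reqs != NULL || count == 0);
    size_t out = 0;
    for (size_t i = 0; i < count; i++) {
        if (out > 0 &&
            reqs[out - 1].drawable == reqs[i].drawable &&
            reqs[out - 1].gc == reqs[i].gc) {
            reqs[out - 1].segment_count += reqs[i].segment_count;
        } else {
            reqs[out++] = reqs[i];
        }
    }
    return out;
}

/* Bytes these requests would occupy on the wire. */
size_t wire_bytes(const struct queued_poly_segment *reqs, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += POLY_SEGMENT_HEADER_BYTES
               + reqs[i].segment_count * SEGMENT_BYTES;
    return total;
}
```

Each merge saves one 12-byte fixed part, so the gain per request is
small; it adds up only when an application emits long runs of tiny
requests against the same drawable and graphics context.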

I could be forgetting some other optimization that's in there, but
nothing is coming to mind. I think the above two things are it. Still,
latency-hiding is a big deal: once you have it, tunneling your X
connection through SSH with compression turned on should get you close
to optimal performance, and using Xnest as a proxy near your clients can
help too.

I hope that answers your question.

--Jamey


