[Xcb] profiling and performance

Jamey Sharp jamey at minilop.net
Wed May 3 10:55:26 PDT 2006

On Wed, May 03, 2006 at 06:58:56PM +0200, Vincent Torri wrote:
> Hey,

Hi Vincent!

Hey, it just occurred to me: would you retest Xlib with the environment
variable XLIBBUFFERSIZE set to 4? If its performance drops to match
XCB's then we'll know it's XCB's smaller output buffer that's the cause
of the performance difference.
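
To make the buffer-size idea concrete, here is a rough toy model, not code from Xlib or XCB. It only counts how many write calls a client needs if it queues fixed-size requests and flushes whenever the next one won't fit. The function name writes_needed and all the sizes are made up for illustration; real request sizes vary and neither library is this simple.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a client-side output buffer (hypothetical, for
 * illustration only). Requests are queued until the next one would not
 * fit, then the whole buffer goes out in one simulated write call.
 * Returns the number of write calls needed to send `requests` requests
 * of `request_bytes` bytes each, including the final flush of a
 * partially filled buffer. */
static size_t writes_needed(size_t requests, size_t request_bytes,
                            size_t buffer_bytes)
{
    /* Assumption of this sketch: a single request always fits. */
    assert(request_bytes > 0 && request_bytes <= buffer_bytes);

    size_t used = 0;    /* bytes currently queued in the buffer */
    size_t writes = 0;  /* simulated write calls so far */

    for (size_t i = 0; i < requests; i++) {
        if (used + request_bytes > buffer_bytes) {
            writes++;   /* buffer full: flush before queueing more */
            used = 0;
        }
        used += request_bytes;
    }
    if (used > 0)
        writes++;       /* final flush of whatever is left */
    return writes;
}
```

In this model, 1000 requests of 32 bytes take writes_needed(1000, 32, 4096) = 8 writes with a 4096-byte buffer and writes_needed(1000, 32, 16384) = 2 writes with a 16384-byte one, so a smaller buffer means proportionally more system calls for the same traffic.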

> ha, right. I compile XCB with no optimisation flags and -g. Maybe that can
> change something a bit.

Yeah. :-) You still need -g to get useful output from oprofile, though,
and it won't have any effect on performance -- it's just putting a
little more data into the compiled library.

> I usually use for my computer these flags :
> -march=athlon-xp -O3 -ffast-math -pipe -funroll-loops
> -fomit-frame-pointer -msse -mfpmath=sse,387
> Do you find them reasonable?

I suppose XCB might get a little bit of a boost from -march and -msse,
but the other stuff can probably all be replaced with -O2 with no
measurable difference. I think it should all be harmless though.

> ok. Then I need to think a LOT in order to integrate the loop in a thread,
> as i don't know at all how to do that :)
> This can be a hard part for later, as ecore is not threaded at all. It's
> not thread safe at all.

That's fine. Event polling was a pretty small part of the profile
anyway. If you get to a point where that last 0.03% matters to you, you
could probably reduce the frequency of polling to once every few frames.

> > Understanding when you can remove XCBSync calls is hard though.
> Oulaaa, I don't know at all when to remove them :D I know that there is
> a call that I can remove. That's all.
> Same for XCBFlush ?

Flush has a small performance impact, but not as much as Sync. Sync
forces a round-trip/context switch, blocking your application until all
the work it has asked for is done. Flush just causes a system call,
which might still trigger a context switch, but the client remains
ready-to-run. Because the client blocks during a sync, syncing can have
a big effect on throughput (== framerate) without having much effect on
CPU usage or the profiling results.
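
As a back-of-the-envelope sketch of why that happens, assume each frame costs a fixed amount of client CPU time plus one blocking round trip per sync. The function name frames_per_second and the millisecond figures are invented for illustration, not measurements from this test.

```c
#include <assert.h>

/* Toy model (hypothetical numbers, not measurements): each frame costs
 * `cpu_ms` of client CPU time plus one blocking round trip of
 * `round_trip_ms` for every sync. While blocked the client burns no
 * CPU, so syncs lengthen the wall-clock time per frame without adding
 * to the CPU time a profiler would attribute to the client. */
static double frames_per_second(double cpu_ms, double round_trip_ms,
                                unsigned syncs_per_frame)
{
    assert(cpu_ms > 0.0 && round_trip_ms >= 0.0);

    /* Wall-clock time for one frame: work plus time spent waiting. */
    double frame_ms = cpu_ms + round_trip_ms * syncs_per_frame;
    return 1000.0 / frame_ms;
}
```

With 2 ms of CPU work per frame and a 0.5 ms round trip, frames_per_second(2.0, 0.5, 0) is 500 and frames_per_second(2.0, 0.5, 4) is 250. The framerate halves, yet the CPU time spent per frame is still 2 ms, which is why the cost barely shows up in a profile.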

Sync actually should almost never be necessary. But I don't have any
general rules for how to tell when you need it. Just double-check that
your XCB test isn't doing Syncs in more places than the Xlib version.

It occurs to me that for XCBGetInputFocus and XCBGetInputFocusReply to
be showing up in the profile at all, you must be calling XCBSync really
often. I wouldn't be surprised if both versions of the test could get a
performance boost while remaining correct by removing a bunch of syncs.

> After that, i'll try to compare xlib and xcb. Then i'll give other numbers
> :D

Yay! More numbers! :-)

> thank you for all that information

No problem. I don't want anyone saying, "I'm not switching to XCB
because it's slow." :-)
