[Mesa-dev] Performance glxSwapBuffers 32 bit vs. 64 bit

Theiss, Ingo ingo.theiss at i-matrixx.de
Thu Nov 10 02:01:14 PST 2011


Hi Michel,

Thanks for the reply and your suggestions.

It took me a while to figure out how to use and run oprofile, but I was finally able to produce some hopefully usable output.

Two functions clearly stand out when comparing the 32 bit and 64 bit profiles: st_readpixels (mesa/state_tracker/st_cb_readpixels.c:382) and _mesa_pack_rgba_span_float (mesa/main/pack.c:552).

You can take a look at the complete reports and callgraph images at:

http://www.i-matrixx.de/oreport_glxspheres64.txt
https://www.i-matrixx.de/oprofile_glxspheres64.png

https://www.i-matrixx.de/oreport_glxspheres32.txt
https://www.i-matrixx.de/oprofile_glxspheres32.png
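
As a side note, the slowdown visible in the VirtualGL traces quoted below can be put into a number with a small script. The following Python is only a sketch: the regular expression is an assumption about the trace line format, based solely on the lines shown in this mail, and the names TRACE_LINE, TRACE_64, TRACE_32 and mean_call_time are made up for illustration.

```python
import re
from statistics import mean

# Assumed line format, taken from the traces in this mail:
#   [VGL] <call> (<arguments> ) <duration> ms
TRACE_LINE = re.compile(r"\[VGL\]\s+(\w+)\s+\(.*\)\s+([0-9.]+)\s+ms")

# The glXSwapBuffers lines copied from the 64 bit trace.
TRACE_64 = """
[VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 28.770924 ms
[VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 29.365063 ms
"""

# The glXSwapBuffers lines copied from the 32 bit trace.
TRACE_32 = """
[VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.419075 ms
[VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.005891 ms
"""


def mean_call_time(trace_text, call="glXSwapBuffers"):
    """Return the mean duration in ms of `call` in a VirtualGL trace."""
    times = [
        float(match.group(2))
        for match in TRACE_LINE.finditer(trace_text)
        if match.group(1) == call
    ]
    if not times:
        raise ValueError(f"no {call} lines found in trace")
    return mean(times)


mean_64 = mean_call_time(TRACE_64)
mean_32 = mean_call_time(TRACE_32)
print(f"64 bit: {mean_64:.3f} ms per glXSwapBuffers")
print(f"32 bit: {mean_32:.3f} ms per glXSwapBuffers")
print(f"32 bit / 64 bit: {mean_32 / mean_64:.2f}")
```

With only the four glXSwapBuffers samples above, this works out to roughly 29.1 ms versus 65.2 ms per call, so the 32 bit case takes about 2.2 times as long. A full trace file could be read and passed to mean_call_time in the same way.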

I hope this helps to find the cause and improve the driver.
Too bad I have no knowledge of C programming, as this is getting interesting.

Let me know if you need anything else.

Thanks for your time.

Regards,

Ingo
 
On Monday, 07 November 2011 16:10 CET, Michel Dänzer <michel at daenzer.net> wrote:
 
> On Fri, 2011-11-04 at 13:38 +0100, Theiss, Ingo wrote:
> > 
> > I am using VirtualGL (http://www.virtualgl.org) for full 3D hardware
> > accelerated remote OpenGL applications with latest mesa from git
> > (compiled for both 32 bit and 64 bit) on my 64 bit Debian Wheezy box.
> > 
> > When I run a 32 bit application with VirtualGL I see a performance drop
> > of nearly 50% compared to running the same application as 64 bit with
> > VirtualGL. I first contacted the VirtualGL developer, who said that the
> > performance drop is not a VirtualGL problem but related to the
> > underlying 3D driver. The performance drop seems related to the
> > function glXSwapBuffers, as can be seen in the function call tracing
> > of VirtualGL:
> > 
> > 64 bit application with VirtualGL
> > -------------------------------------
> > [VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 28.770924 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.005960 ms
> > [VGL] glViewport (x=0 y=0 width=1240 height=900 ) 0.003099 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.002861 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.002861 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.000000 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.000954 ms
> > [VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 29.365063 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.006914 ms
> > 
> > 32 bit application with VirtualGL
> > -------------------------------------
> > [VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.419075 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.005930 ms
> > [VGL] glViewport (x=0 y=0 width=1240 height=900 ) 0.003049 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.002989 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.004064 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.001051 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.001044 ms
> > [VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.005891 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.004926 ms
> > 
> > 
> > Is this performance drop normal or expected behaviour when running a
> > 32 bit application on a 64 bit OS, or some kind of "bug"?
> 
> Probably the latter. You should try to find out where the time is spent
> inside glXSwapBuffers in both cases. If the function is (at least
> roughly) CPU bound, this should be relatively easy with a profiler such
> as sysprof, perf or oprofile. 
> 
> 
> -- 
> Earthling Michel Dänzer           |                   http://www.amd.com
> Libre software enthusiast         |          Debian, X and DRI developer
> 


