[Mesa-dev] Performance glxSwapBuffers 32 bit vs. 64 bit
Theiss, Ingo
ingo.theiss at i-matrixx.de
Thu Nov 10 02:01:14 PST 2011
Hi Michel,
thanks for the reply and your suggestions.
It took me a while to figure out how to run oprofile, but I was finally able to produce some hopefully usable output.
The functions st_readpixels (mesa/state_tracker/st_cb_readpixels.c:382) and _mesa_pack_rgba_span_float (mesa/main/pack.c:552) clearly stand out when comparing the 32 bit and 64 bit profiles.
You can take a look at the complete reports and callgraph images at:
http://www.i-matrixx.de/oreport_glxspheres64.txt
https://www.i-matrixx.de/oprofile_glxspheres64.png
https://www.i-matrixx.de/oreport_glxspheres32.txt
https://www.i-matrixx.de/oprofile_glxspheres32.png
I hope this helps to find the cause and improve the driver.
Too bad I have no knowledge of C programming, because this is getting interesting.
Let me know if you need anything else.
Thanks for your time.
Regards,
Ingo
On Monday, 07 November 2011 16:10 CET, Michel Dänzer <michel at daenzer.net> wrote:
> On Fri, 2011-11-04 at 13:38 +0100, Theiss, Ingo wrote:
> >
> > I am using VirtualGL (http://www.virtualgl.org) for full 3D hardware
> > accelerated remote OpenGL applications with latest mesa from git
> > (compiled for both 32 bit and 64 bit) on my 64 bit Debian Wheezy box.
> >
> > When I run a 32 bit application with VirtualGL, I see a performance
> > drop of nearly 50% compared to running the same application as 64 bit
> > with VirtualGL. I first contacted the VirtualGL developer, who said
> > that the performance drop is not a VirtualGL problem but is related to
> > the underlying 3D driver. The drop seems to be related to the function
> > glXSwapBuffers, as can be seen in VirtualGL's function call tracing:
> >
> > 64 bit application with VirtualGL
> > -------------------------------------
> > [VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 28.770924 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.005960 ms
> > [VGL] glViewport (x=0 y=0 width=1240 height=900 ) 0.003099 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.002861 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.002861 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.000000 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.000954 ms
> > [VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 29.365063 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.006914 ms
> >
> > 32 bit application with VirtualGL
> > -------------------------------------
> > [VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.419075 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.005930 ms
> > [VGL] glViewport (x=0 y=0 width=1240 height=900 ) 0.003049 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.002989 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.004064 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.001051 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.001044 ms
> > [VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.005891 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.004926 ms
> >
> >
> > Is this performance drop normal or expected behaviour when running a
> > 32 bit application on a 64 bit OS, or is it some kind of "bug"?
>
> Probably the latter. You should try to find out where the time is spent
> inside glXSwapBuffers in both cases. If the function is (at least
> roughly) CPU bound, this should be relatively easy with a profiler such
> as sysprof, perf or oprofile.
>
>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Debian, X and DRI developer
>
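The glXSwapBuffers timings in the quoted trace above can be compared numerically. The following is only a minimal sketch in Python, not part of VirtualGL or Mesa: the regular expression and the helper names (LINE_RE, timings, mean_ms) are assumptions made for illustration, and the only data used are the four glXSwapBuffers lines copied from the trace.

```python
import re
from statistics import mean

# glXSwapBuffers lines copied verbatim from the quoted VirtualGL trace.
TRACE_64 = """\
[VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 28.770924 ms
[VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 29.365063 ms
"""
TRACE_32 = """\
[VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.419075 ms
[VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.005891 ms
"""

# Assumed line layout: "[VGL] <function> (<arguments>) <time> ms".
LINE_RE = re.compile(r"^\[VGL\]\s+(\w+)\s+\(.*\)\s+([0-9.]+)\s+ms\s*$")


def timings(trace, func):
    """Return all timings (in ms) recorded for `func` in a trace string."""
    result = []
    for line in trace.splitlines():
        # Drop mail quote prefixes such as "> > " before matching.
        match = LINE_RE.match(line.lstrip("> ").rstrip())
        if match and match.group(1) == func:
            result.append(float(match.group(2)))
    return result


def mean_ms(trace, func="glXSwapBuffers"):
    """Mean time in ms spent in `func` across the trace."""
    return mean(timings(trace, func))


mean_64 = mean_ms(TRACE_64)
mean_32 = mean_ms(TRACE_32)
ratio = mean_32 / mean_64

print(f"64 bit mean: {mean_64:.3f} ms")
print(f"32 bit mean: {mean_32:.3f} ms")
print(f"32 bit / 64 bit: {ratio:.2f}x")
```

With these four samples, the 32 bit glXSwapBuffers calls take a bit more than twice as long as the 64 bit ones, which is consistent with the reported drop of roughly half the performance. Two samples per build is a very small basis, so a longer trace would give a more reliable figure.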