[Mesa-dev] Performance glxSwapBuffers 32 bit vs. 64 bit

Theiss, Ingo ingo.theiss at i-matrixx.de
Fri Nov 11 05:15:35 PST 2011


On Friday, 11 November 2011 12:09 CET, Michel Dänzer <michel at daenzer.net> wrote:

> So It makes sense to find a glReadPixels in VirtualGL's glxSwapBuffers.
> 
> Ah. I thought the time measurements in Ingo's original post were for the
> Mesa glXSwapBuffers, not the VirtualGL one. If it's the latter, then
> this makes sense.
> 
> Ingo, I noticed that your 64-bit and 32-bit drivers were built from
> slightly different Git snapshots. Is the problem still the same if you
> build both from the same, current snapshot?
> 
> If yes, have you compared the compiler flags that end up being used in
> both cases? E.g., in 64-bit mode SSE is always available, so there might
> be some auto-vectorization going on in that case.

I've rebuilt my 64-bit and 32-bit drivers from a fresh Git snapshot and turned on all processor optimizations in both builds.
Nevertheless, the 32-bit readback performance measured inside VirtualGL is still only half of the 64-bit readback performance, and of course the rendered window scene is noticeably slower too :-(

Here are the compiler flags used.

32-bit:

CFLAGS: -O2 -Wall -g -m32 -march=amdfam10 -mtune=amdfam10 -fno-omit-frame-pointer -Wall -Wmissing-prototypes -std=c99 -ffast-math -fno-strict-aliasing -fno-builtin-memcmp -m32 -O2 -Wall -g -m32 -march=amdfam10 -mtune=amdfam10 -fno-omit-frame-pointer -fPIC -m32

CXXFLAGS: -O2 -Wall -g -m32 -march=amdfam10 -mtune=amdfam10 -Wall -fno-strict-aliasing -fno-builtin-memcmp -m32 -O2 -Wall -g -m32 -march=amdfam10 -mtune=amdfam10 -fPIC -m32

Macros: -D_GNU_SOURCE -DPTHREADS -DTEXTURE_FLOAT_ENABLED -DHAVE_POSIX_MEMALIGN -DUSE_XCB -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_TLS -DPTHREADS -DUSE_EXTERNAL_DXTN_LIB=1 -DIN_DRI_DRIVER -DHAVE_ALIAS -DHAVE_MINCORE -DHAVE_LIBUDEV -DHAVE_XCB_DRI2 -DXCB_DRI2_CONNECT_DEVICE_NAME_BROKEN -D__STDC_CONSTANT_MACROS -DUSE_X86_ASM -DUSE_MMX_ASM -DUSE_3DNOW_ASM -DUSE_SSE_ASM

64-bit:

CFLAGS: -O2 -Wall -g -march=amdfam10 -mtune=amdfam10 -fno-omit-frame-pointer -Wall -Wmissing-prototypes -std=c99 -ffast-math -fno-strict-aliasing -fno-builtin-memcmp -m64 -O2 -Wall -g -march=amdfam10 -mtune=amdfam10 -fno-omit-frame-pointer -fPIC

CXXFLAGS: -O2 -Wall -g -march=amdfam10 -mtune=amdfam10 -Wall -fno-strict-aliasing -fno-builtin-memcmp -m64 -O2 -Wall -g -march=amdfam10 -mtune=amdfam10 -fPIC

Macros: -D_GNU_SOURCE -DPTHREADS -DTEXTURE_FLOAT_ENABLED -DHAVE_POSIX_MEMALIGN -DUSE_XCB -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_TLS -DPTHREADS -DUSE_EXTERNAL_DXTN_LIB=1 -DIN_DRI_DRIVER -DHAVE_ALIAS -DHAVE_MINCORE -DHAVE_LIBUDEV -DHAVE_XCB_DRI2 -DXCB_DRI2_CONNECT_DEVICE_NAME_BROKEN -D__STDC_CONSTANT_MACROS -DUSE_X86_64_ASM
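
To make the comparison easier to read, here is a small Python sketch that diffs the two flag sets listed above. It is not part of the Mesa build; the function name flag_diff and the variable names are made up for illustration, and the flags are simply copied from this mail (with -fno-omit-frame-pointer written as one token).

```python
def flag_diff(a, b):
    """Return (only_in_a, only_in_b) as sorted lists of unique flags."""
    sa, sb = set(a.split()), set(b.split())
    return sorted(sa - sb), sorted(sb - sa)

# CFLAGS as listed in this mail.
CFLAGS_32 = (
    "-O2 -Wall -g -m32 -march=amdfam10 -mtune=amdfam10 -fno-omit-frame-pointer "
    "-Wall -Wmissing-prototypes -std=c99 -ffast-math -fno-strict-aliasing "
    "-fno-builtin-memcmp -m32 -O2 -Wall -g -m32 -march=amdfam10 -mtune=amdfam10 "
    "-fno-omit-frame-pointer -fPIC -m32"
)
CFLAGS_64 = (
    "-O2 -Wall -g -march=amdfam10 -mtune=amdfam10 -fno-omit-frame-pointer "
    "-Wall -Wmissing-prototypes -std=c99 -ffast-math -fno-strict-aliasing "
    "-fno-builtin-memcmp -m64 -O2 -Wall -g -march=amdfam10 -mtune=amdfam10 "
    "-fno-omit-frame-pointer -fPIC"
)

# The two Macros lines share this prefix and differ only in their tail.
MACROS_COMMON = (
    "-D_GNU_SOURCE -DPTHREADS -DTEXTURE_FLOAT_ENABLED -DHAVE_POSIX_MEMALIGN "
    "-DUSE_XCB -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_TLS "
    "-DPTHREADS -DUSE_EXTERNAL_DXTN_LIB=1 -DIN_DRI_DRIVER -DHAVE_ALIAS "
    "-DHAVE_MINCORE -DHAVE_LIBUDEV -DHAVE_XCB_DRI2 "
    "-DXCB_DRI2_CONNECT_DEVICE_NAME_BROKEN -D__STDC_CONSTANT_MACROS "
)
MACROS_32 = MACROS_COMMON + "-DUSE_X86_ASM -DUSE_MMX_ASM -DUSE_3DNOW_ASM -DUSE_SSE_ASM"
MACROS_64 = MACROS_COMMON + "-DUSE_X86_64_ASM"

only_32, only_64 = flag_diff(CFLAGS_32 + " " + MACROS_32,
                             CFLAGS_64 + " " + MACROS_64)
print("32-bit only:", " ".join(only_32))
print("64-bit only:", " ".join(only_64))
```

Apart from -m32/-m64, the only difference this shows is the set of USE_*_ASM macros, so the optimization flags themselves are the same in both builds.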

Below is some of VirtualGL's internal performance tracing:

64-bit:

Polygons in scene: 62464
[VGL] Shared memory segment ID for vglconfig: 6848522
[VGL] VirtualGL v2.2.90 64-bit (Build 20110813)
[VGL] Opening local display :0
[VGL] NOTICE: Replacing dlopen("libGL.so.1") with dlopen("librrfaker.so")
[VGL] NOTICE: Replacing dlopen("libGL.so.1") with dlopen("librrfaker.so")
Visual ID of window: 0x21
Context is Direct
OpenGL Renderer: Gallium 0.4 on AMD BARTS
[VGL] Using synchronous readback (GL format = 0x80e1)
Readback    -   68.18 Mpixels/sec-   61.10 fps
Blit        -  307.79 Mpixels/sec-  275.80 fps
Total       -   55.67 Mpixels/sec-   49.88 fps

32-bit:

Polygons in scene: 62464
[VGL] Shared memory segment ID for vglconfig: 6946826
[VGL] VirtualGL v2.2.90 32-bit (Build 20110815)
[VGL] Opening local display :0
[VGL] NOTICE: Replacing dlopen("libGL.so.1") with dlopen("librrfaker.so")
[VGL] NOTICE: Replacing dlopen("libGL.so.1") with dlopen("librrfaker.so")
Visual ID of window: 0x21
Context is Direct
OpenGL Renderer: Gallium 0.4 on AMD BARTS
[VGL] Using synchronous readback (GL format = 0x80e1)
Readback    -   33.80 Mpixels/sec-   30.29 fps
Blit        -  307.46 Mpixels/sec-  275.51 fps
Total       -   30.44 Mpixels/sec-   27.27 fps
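
As a rough cross-check of the two traces, the following Python sketch works out the frame size implied by each readback line, the 32-bit/64-bit readback ratio, and what the total would be if readback and blit simply ran one after the other. That serial model is my assumption (suggested by the "synchronous readback" line), not something VirtualGL documents here; the names TRACE, frame_mpixels and serial_rate are made up for illustration.

```python
# Figures copied from the VirtualGL traces: (Mpixels/sec, fps).
TRACE = {
    "64-bit": {"readback": (68.18, 61.10), "blit": (307.79, 275.80), "total": (55.67, 49.88)},
    "32-bit": {"readback": (33.80, 30.29), "blit": (307.46, 275.51), "total": (30.44, 27.27)},
}

def frame_mpixels(rate, fps):
    """Frame size in Mpixels implied by a throughput and a frame rate."""
    return rate / fps

def serial_rate(r1, r2):
    """Combined throughput of two stages run back to back (assumed model)."""
    return 1.0 / (1.0 / r1 + 1.0 / r2)

for arch, t in TRACE.items():
    rb_rate, rb_fps = t["readback"]
    blit_rate, _ = t["blit"]
    total_rate, _ = t["total"]
    print(f"{arch}: frame = {frame_mpixels(rb_rate, rb_fps):.3f} Mpixels, "
          f"readback = {1000.0 / rb_fps:.1f} ms/frame, "
          f"serial estimate = {serial_rate(rb_rate, blit_rate):.2f} Mpixels/sec "
          f"(reported total {total_rate:.2f})")

# Ratio of 32-bit to 64-bit readback throughput.
ratio = TRACE["32-bit"]["readback"][0] / TRACE["64-bit"]["readback"][0]
print(f"32-bit readback is {ratio:.1%} of 64-bit")
```

Both traces imply the same frame size, and the serial estimate lands within about 1% of the reported totals, so the drop in the total appears to come entirely from the readback stage; the blit numbers are practically identical.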


The VirtualGL developer says the slow readback performance in 32-bit mode is outside his scope and driver-related.

Regards,

Ingo
 
 
 

