[PATCH] Per-call profiling support

Ryan C. Gordon icculus at icculus.org
Sun Jan 8 22:04:38 PST 2012


I added profiling to glretrace, to fix a performance issue in a game I'm 
working on.

The gist is that you make a trace as normal, but you run glretrace with 
the -p option...it'll re-run the GL calls as usual, but it'll note how 
long each call took and write a log. This has the benefit of being 
pretty precise for measuring GL performance per-call, and also removes 
all the overhead of the game, so you're measuring how fast you can push 
the GPU.

In that game with the performance issue, looking at a slow frame, within 
seconds, I knew where the problem was...

507216 [1.708 usec] glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 3)
507217 [3431.36 usec] glMapBufferARB(target = GL_ARRAY_BUFFER, access = 
GL_READ_WRITE) = 0xc8406400
507218 [1.913 usec] glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 3)
507219 [639.253 usec] memcpy(dest = 0xc8406400, src = blob(3120000), n = 
3120000)
507220 [1.994 usec] glUnmapBufferARB(target = GL_ARRAY_BUFFER) = true

...a buffer mapping/write/unmap that takes 4+ milliseconds?! (and many 
more like this over the frame) ... I swapped that out with a 
GL_ARB_map_buffer_range approach, and no more bottleneck!

Without this, I couldn't find a single tool on Linux that could do 
decent OpenGL profiling, but spending an hour adding this to ApiTrace 
changed my strategy from psychic debugging to immediate diagnosis and 
correction. Once I could profile the problem, it took five minutes to 
fix, which felt great.

I've attached the apitrace patch; this was my first shot at it. Even if 
the idea makes sense to everyone, it probably needs at least some modest 
cleanups, so comments and criticism are appreciated.

Thanks,
--ryan.



More information about the apitrace mailing list