[PATCH] Per-call profiling support
Ryan C. Gordon
icculus at icculus.org
Sun Jan 8 22:04:38 PST 2012
I added profiling to glretrace, to fix a performance issue in a game I'm
The gist is that you make a trace as normal, but run glretrace with
the -p option. It re-runs the GL calls as usual, but notes how long
each call took and writes a log. This has the benefit of being
pretty precise for measuring GL performance per-call, and also removes
all the overhead of the game, so you're measuring how fast you can push
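
To illustrate the idea of timing each replayed call, here is a minimal sketch in Python. It is not the code from the patch (glretrace itself is C++); the function name `replay_with_timing` and the `(name, func, args)` tuple layout are made up for this example. It only measures wall-clock time spent inside each call on the CPU side, which is also why a call that blocks on the driver shows up as slow.

```python
import time


def replay_with_timing(calls, log):
    """Replay recorded calls in order and record how long each one took.

    `calls` is a list of (name, func, args) tuples. This layout is a
    hypothetical stand-in for a trace, not apitrace's real format.
    Each log entry is (call_number, elapsed_microseconds, name).
    """
    for call_no, (name, func, args) in enumerate(calls):
        start = time.perf_counter()
        func(*args)  # re-issue the recorded call
        elapsed_usec = (time.perf_counter() - start) * 1e6
        log.append((call_no, elapsed_usec, name))
    return log
```

Because only the recorded calls are replayed, none of the application's own work ends up in the measurements.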
In that game with the performance issue, looking at a slow frame, within
seconds, I knew where the problem was...
507216 [1.708 usec] glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 3)
507217 [3431.36 usec] glMapBufferARB(target = GL_ARRAY_BUFFER, access = GL_READ_WRITE) = 0xc8406400
507218 [1.913 usec] glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 3)
507219 [639.253 usec] memcpy(dest = 0xc8406400, src = blob(3120000), n =
507220 [1.994 usec] glUnmapBufferARB(target = GL_ARRAY_BUFFER) = true
...a buffer map/write/unmap that takes 4+ milliseconds?! (And there were
many more like it over the frame.) I swapped that out for a
GL_ARB_map_buffer_range approach, and the bottleneck was gone!
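
For longer logs, scanning by eye gets tedious. Here is a rough sketch of a filter that pulls out the slow calls, assuming every call line looks like the excerpt above (call number, then `[N usec]`, then the function name). The regular expression, the function name `slow_calls`, and the 100 usec default threshold are my own choices for illustration, not part of the patch.

```python
import re

# Matches lines such as:
#   507217 [3431.36 usec] glMapBufferARB(target = ...
# Assumes the layout shown in the log excerpt; wrapped continuation
# lines do not match and are skipped.
CALL_RE = re.compile(r"^(\d+) \[([0-9.]+) usec\] (\w+)\(")


def slow_calls(lines, threshold_usec=100.0):
    """Return (usec, call_number, name) tuples at or above the threshold,
    slowest first."""
    hits = []
    for line in lines:
        m = CALL_RE.match(line)
        if m is None:
            continue  # continuation line or unrelated output
        call_no = int(m.group(1))
        usec = float(m.group(2))
        name = m.group(3)
        if usec >= threshold_usec:
            hits.append((usec, call_no, name))
    return sorted(hits, reverse=True)
```

Run over the five calls quoted above, this flags only glMapBufferARB (call 507217) and memcpy (call 507219), in that order.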
Without this, I couldn't find a single tool on Linux that could do
decent OpenGL profiling, but spending an hour adding this to ApiTrace
changed my strategy from psychic debugging to immediate diagnosis and
correction. Once I could profile the problem, it took five minutes to
fix, which felt great.
I've attached the apitrace patch; this was my first shot at it. Even if
the idea makes sense to everyone, it probably needs at least some modest
cleanups, so comments and criticism are appreciated.