[PATCH] Per-call profiling support
Ryan C. Gordon
icculus at icculus.org
Sun Jan 8 22:04:38 PST 2012
I added profiling to glretrace to fix a performance issue in a game I'm
working on.
The gist is that you make a trace as normal, but you run glretrace with
the -p option. It re-runs the GL calls as usual, but it notes how long
each call took and writes a log. This is pretty precise for measuring GL
performance per-call, and it also removes all the overhead of the game,
so you're measuring how fast you can push the GPU.
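To illustrate the idea (this is not the code in the patch), here is a minimal Python sketch of per-call timing during a replay. The function name replay_with_timing and the (name, func) call list are hypothetical stand-ins for the recorded trace; the real work happens in glretrace itself.

```python
import time

def replay_with_timing(calls, log):
    """Run each recorded call in order and note how long it took.

    `calls` is a hypothetical list of (name, func) pairs standing in
    for the trace; `log` collects (call_no, elapsed_usec, name) tuples.
    """
    for call_no, (name, func) in enumerate(calls):
        start = time.perf_counter()
        func()  # re-issue the recorded call
        elapsed_usec = (time.perf_counter() - start) * 1e6
        log.append((call_no, elapsed_usec, name))
    return log
```

Because only the recorded calls are re-issued, the application's own work never shows up in the timings.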
Looking at a slow frame from the game with the performance issue, I knew
within seconds where the problem was...
507216 [1.708 usec] glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 3)
507217 [3431.36 usec] glMapBufferARB(target = GL_ARRAY_BUFFER, access = GL_READ_WRITE) = 0xc8406400
507218 [1.913 usec] glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 3)
507219 [639.253 usec] memcpy(dest = 0xc8406400, src = blob(3120000), n = 3120000)
507220 [1.994 usec] glUnmapBufferARB(target = GL_ARRAY_BUFFER) = true
...a buffer mapping/write/unmap that takes 4+ milliseconds?! (and many
more like this over the frame) ... I swapped that out with a
GL_ARB_map_buffer_range approach, and no more bottleneck!
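For a longer log, a small script can do the "find the slow calls" step. The following is a rough Python sketch, assuming each call sits on one line in the shape shown in the excerpt above (call number, time in usec in brackets, then the call text), with any mail line-wrapping undone. The helper name slowest_calls is made up for this example.

```python
import re

# Matches "<call no> [<time> usec] <call text>", as in the excerpt above.
LINE_RE = re.compile(r"^(\d+) \[([\d.]+) usec\] (.*)$")

def slowest_calls(lines, top=3):
    """Return up to `top` (usec, call_no, text) tuples, slowest first."""
    entries = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:  # skip anything that is not a timed call line
            entries.append((float(m.group(2)), int(m.group(1)), m.group(3)))
    entries.sort(reverse=True)
    return entries[:top]

sample = [
    "507216 [1.708 usec] glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 3)",
    "507217 [3431.36 usec] glMapBufferARB(target = GL_ARRAY_BUFFER, access = GL_READ_WRITE) = 0xc8406400",
    "507218 [1.913 usec] glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 3)",
    "507219 [639.253 usec] memcpy(dest = 0xc8406400, src = blob(3120000), n = 3120000)",
    "507220 [1.994 usec] glUnmapBufferARB(target = GL_ARRAY_BUFFER) = true",
]

for usec, call_no, text in slowest_calls(sample, top=2):
    print(f"{usec:10.3f} usec  call {call_no}  {text}")
```

On the five lines above, this ranks the glMapBufferARB call first and the memcpy second.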
Without this, I couldn't find a single tool on Linux that could do
decent OpenGL profiling, but spending an hour adding this to ApiTrace
changed my strategy from psychic debugging to immediate diagnosis and
correction. Once I could profile the problem, it took five minutes to
fix, which felt great.
I've attached the apitrace patch; this was my first shot at it. Even if
the idea makes sense to everyone, it probably needs at least some modest
cleanups, so comments and criticism are appreciated.
Thanks,
--ryan.