<div class="gmail_quote">On Wed, Aug 24, 2011 at 5:15 PM, Zack Rusin <<a href="mailto:zack@kde.org">zack@kde.org</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"> <div class="im">On Wednesday, August 24, 2011 03:28:00 PM Chris Fester wrote: > </div> I think that computing timings during tracing could be useful if your application is CPU bound - a huge discrepancy between tracing and retracing timings would mean that the app is spending too much time creating the data. And the biggest offsets would point you to the places in your app that are the bottlenecks. Other than this case I'm not sure if computing the timings while tracing is particularly useful, because for all other cases it should be a lot better to do it while retracing. </blockquote><div> The problem I've got is I have the same rendering app, same code base, same OS, etc. I upgrade the ATI kernel module and library stack. Now I have very specific reproducible scenarios where my CPU usage shoots up for a certain range of frames, appearing to cause on-screen rendering to slow down/stutter. The CPU usage is mostly accumulated in the rendering process, although one of my top iterations shows that X.org is also chewing up time (top -d 0.1). Our rendering process is using glXSwapIntervalSGI(1), so its framerate is more-or-less clamped to VSYNC at 60Hz. The rendering expert guy here suspects that something has changed in the ATI libs/drivers WRT loading textures onto the card. Unfortunately the problem only seems to apply to certain textures. Soooo.... it is possible that there's some sort of interaction between the rendering process and ATI's libGL that truly is more of a problem with our rendering process. But I do have to prove for sure that the CPU time of each gl call is about the same (compared to the older ATI libs). I agree with you that *most* GL calls wind up queuing up a command in the card's command buffer, and that won't take much CPU. 
I also agree that it will be useful to prove that GPU time between driver versions isn't changing. I like how Yuanhan's implementation times specific functions as opposed to everything. I may do some initial timing with my variant first, then "drill down" with Yuanhan's. Thanks! Chris </div></div> -- Oh, meltdown... It's one of these annoying buzzwords. We prefer to call it an unrequested fission surplus. -- Mr. Burns, The Simpsons