[RFC] [PATCH 1/2] Add a basic support of logging cpu and gpu time

Thu Aug 25 13:18:22 PDT 2011

On Wed, Aug 24, 2011 at 5:15 PM, Zack Rusin <zack at kde.org> wrote:

> On Wednesday, August 24, 2011 03:28:00 PM Chris Fester wrote:
> >
> I think that computing timings during tracing could be useful if your
> application is CPU bound - a huge discrepancy between tracing and retracing
> timings would mean that the app is spending too much time creating the
> data. And the biggest offsets would point you to the places in your app
> that are the bottlenecks.
>
> Other than this case I'm not sure if computing the timings while tracing is
> particularly useful, because for all other cases it should be a lot better
> to do it while retracing.
>

The problem I've got is I have the same rendering app, same code base, same
OS, etc.  I upgrade the ATI kernel module and library stack.  Now I have
very specific reproducible scenarios where my CPU usage shoots up for a
certain range of frames, appearing to cause on-screen rendering to slow
down/stutter.  The CPU usage is mostly accumulated in the rendering process,
although one of my top iterations shows that X.org is also chewing up time
(top -d 0.1).  Our rendering process is using glXSwapIntervalSGI(1), so its
framerate is more-or-less clamped to VSYNC at 60Hz.  The rendering expert
guy here suspects that something has changed in the ATI libs/drivers WRT
loading textures onto the card.  Unfortunately the problem only seems to
apply to certain textures.

Soooo.... it is possible that there's some sort of interaction between the
rendering process and ATI's libGL that truly is more of a problem with our
rendering process.  But I do have to prove for sure that the CPU time of
each gl call is about the same (compared to the older ATI libs).  I agree
with you that *most* GL calls wind up queuing up a command in the card's
command buffer, and that won't take much CPU.  I also agree that it will be
useful to prove that GPU time between driver versions isn't changing.
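
To make the "about the same" comparison concrete, here is a rough sketch of
how per-call CPU timings from the two driver stacks could be compared once
they have been logged.  The input layout (a dict of call name to a list of
durations in microseconds), the function name compare_call_times, and all the
numbers are made up for illustration; neither patch is assumed to produce
exactly this.

```python
from statistics import median


def compare_call_times(old, new, min_ratio=1.5):
    """Return calls whose median CPU time grew by at least min_ratio.

    old, new: hypothetical dicts mapping a GL call name to a list of
    per-call CPU durations (microseconds) under each driver stack.
    """
    regressions = []
    # Only compare calls that were recorded under both driver versions.
    for name in sorted(old.keys() & new.keys()):
        old_med = median(old[name])
        new_med = median(new[name])
        if old_med > 0 and new_med / old_med >= min_ratio:
            regressions.append((name, old_med, new_med, new_med / old_med))
    # Worst slowdown first.
    regressions.sort(key=lambda r: r[3], reverse=True)
    return regressions


# Made-up sample data, not real measurements.
old_driver = {
    "glTexImage2D": [120.0, 130.0, 125.0],
    "glDrawArrays": [4.0, 5.0, 4.5],
}
new_driver = {
    "glTexImage2D": [900.0, 950.0, 125.0],
    "glDrawArrays": [4.0, 5.0, 5.0],
}

for name, old_med, new_med, ratio in compare_call_times(old_driver, new_driver):
    print(f"{name}: {old_med:.1f} us -> {new_med:.1f} us ({ratio:.1f}x)")
```

The median is used rather than the mean so that a single slow call (say, one
that happened to block on VSYNC) doesn't dominate the comparison.  Since the
problem only seems to hit certain textures, the timings would probably have to
be grouped by texture as well as by call name, which this sketch does not do.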

I like how Yuanhan's implementation times specific functions as opposed to
everything.  I may do some initial timing with my variant first, then "drill
down" with Yuanhan's.

Thanks!
Chris

-- 
Oh, meltdown... It's one of these annoying buzzwords. We prefer to call it
an unrequested fission surplus.
-- Mr. Burns, The Simpsons