Compression branch ready

Zack Rusin zack at kde.org
Thu Aug 25 09:58:13 PDT 2011


On Thursday, August 25, 2011 10:12:51 AM José Fonseca wrote:
> On Wed, Aug 24, 2011 at 2:25 AM, Zack Rusin <zack at kde.org> wrote:
> > Hey,
> > 
> > I think the compression branch is ready and it'd be great if people could
> > give it a spin. I'd like to finally merge it sometime this week.
> > 
> > It makes tracing about 10x faster which is a pretty big win. In that
> > branch we've switched from gzip to snappy
> > (http://code.google.com/p/snappy/). The trace files will be a little
> > bigger though but I think that ~10x speed improvement is worth it.
> > 
> > z
> 
> Zack,
> 
> I've benchmarked with ipers mesademo (which seems a very good test
> case of tracing overhead), and framerate increased 5x by avoiding
> gzflush (already in master), and further 2x by switching to snappy.
> Pretty amazing!

Actually I think that example is our worst case scenario because iirc that's 
the one that's CPU bound, and every CPU-bound app will suffer immensely if you 
reduce the CPU time it gets even a little. I think for this example and others 
like it, it's worth revisiting the threaded-trace branch, but with a different 
model.
Currently the threaded-trace branch creates one extra thread which does both 
compression and disk writes: the rendering thread writes data to a ringbuffer, 
and the writer thread compresses it and writes it to disk. That implies we 
wait on either the compression or the disk I/O, so the rendering thread ends 
up waiting for the writer thread a lot.

I think we need two extra threads:
- a compression thread - the rendering thread writes raw data to a ringbuffer; 
the compression thread compresses it and puts the compressed data in a second 
ringbuffer
- a disk I/O thread - reads from the compressed-data ringbuffer and writes it 
to disk.

With snappy being able to compress at about 250 MB/s and SATA 3 SSD drives 
being able to write even faster, we should be able to keep up without stalling 
the rendering thread for any app that produces less than 250 MB of data per 
second.

I think that should reduce the overhead in CPU-bound apps like ipers to a 
minimum.

At the moment though, at least imho, by far the biggest problem we need to 
solve is loading of large traces in the GUI. 
Other things would be nice, but this is basically the one thing I can't live 
without :)

> I'm happy for you to merge anytime now.

Jose, thanks a lot for your work on signal/exception catching in that 
branch!!! Are you happy with the signal handling code, or do you think we 
should integrate something like the libgg cleanup handling code?
If you're happy with it as it is, I'll merge the branch tonight. If not I'll 
integrate the libgg code and merge it then.

z


More information about the apitrace mailing list