Some notes on optimization work in progress (was: Re: [cairo]
WinXp benchmarks)
David Reveman
davidr at novell.com
Thu Mar 3 12:59:12 PST 2005
On Thu, 2005-03-03 at 21:02 +0100, Soeren Sandmann wrote:
> Carl Worth <cworth at redhat.com> writes:
>
> > Here's an update on where that work stands. First, I've chosen
> > gearflowers.svg[*] as a profile image. It's a rather complex image
> > with lots of splines, a mixture of strokes and fills, and a *lot* of
> > gradients.
> >
> > Rendering this image with current cairo takes about 5-7 seconds on my
> > laptop. Here's how that breaks down under oprofile (using a slightly
> > modified version of svg2png that produces no output PNG file):
> >
> > CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > Profiling through timer interrupt
> > samples  %        app name            symbol name
> > 1371     31.9879  libpixman.so.1.0.0  fbRasterizeEdges8
> > 788      18.3854  libcairo.so.1.0.0   _cairo_pattern_calc_color_at_pixel
> > 481      11.2226  libpixman.so.1.0.0  IcCombineOverU
> > 356       8.3061  libcairo.so.1.0.0   _cairo_pattern_begin_draw
> > 126       2.9398  libcairo.so.1.0.0   _cairo_pattern_shader_linear
> > 108       2.5198  libpixman.so.1.0.0  IcStepOver
> > 102       2.3798  libpixman.so.1.0.0  pixman_compositeGeneral
> > 89        2.0765  libpixman.so.1.0.0  IcFetch_a8
> > 83        1.9365  libpixman.so.1.0.0  IcOver
> > 79        1.8432  libpixman.so.1.0.0  IcCombineMaskU
> > ... [http://cairographics.org/~cworth/images/gearflowers.oprofile]
> >
> > So, the rasterization is topping the list, followed closely by
> > gradient computation, and then compositing. It'd be nicer to get some
> > callgraph-based sums to better estimate those things, but the prime
> > candidates for optimization are obvious enough.
>
> With the sysprof profiler, which does do callgraph-based sums, I get
> these results:
>
> _cairo_pattern_calc_color_at_pixel() 36.86 %
> pixman_composite() 22.90 %
> (with 17.16% of those in pixman_CompositeGeneral)
> fbRasterizeTrapezoid() 16.17 %
>
> The percentages are totals, i.e. they include children of the
> functions. The rasterization times reported by the two profilers are
> quite different.

Either way, the gradient calculations appear to be quite expensive.
The first thing we should do is check that no gradients larger than
necessary are created. After the recent changes that pass patterns
through to the backends, I'm no longer sure that the size is optimal.
The second thing we could do is hook up simple optimizations for
vertical and horizontal gradients, as Owen suggested recently.
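
To make the vertical/horizontal shortcut concrete, here is a rough sketch in
C. It is not cairo or pixman code: the function names, the two-stop gradient,
the 8-bit fixed-point interpolation and the stride measured in pixels are all
assumptions made for illustration. The point is only that an axis-aligned
gradient needs one color evaluation per column or per row instead of one per
pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: interpolate two ARGB32 pixels channel by channel.
 * t is 8-bit fixed point in the range 0..256 (0 = c0, 256 = c1). */
static uint32_t lerp_argb32(uint32_t c0, uint32_t c1, uint32_t t)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t a = (c0 >> shift) & 0xff;
        uint32_t b = (c1 >> shift) & 0xff;
        out |= (((a * (256 - t) + b * t) >> 8) & 0xff) << shift;
    }
    return out;
}

/* Gradient vector is horizontal: the color depends only on x, so every
 * scanline is identical. Evaluate the first row once, then copy it.
 * stride is in pixels (an assumption of this sketch). */
static void fill_horizontal_gradient(uint32_t *pixels, int width, int height,
                                     int stride, uint32_t c0, uint32_t c1)
{
    assert(width > 0 && height > 0 && stride >= width);
    for (int x = 0; x < width; x++) {
        uint32_t t = width > 1 ? (uint32_t)(x * 256 / (width - 1)) : 0;
        pixels[x] = lerp_argb32(c0, c1, t);
    }
    for (int y = 1; y < height; y++)
        memcpy(pixels + (size_t)y * stride, pixels,
               (size_t)width * sizeof *pixels);
}

/* Gradient vector is vertical: the color depends only on y, so each
 * scanline is a solid fill with a single interpolated color. */
static void fill_vertical_gradient(uint32_t *pixels, int width, int height,
                                   int stride, uint32_t c0, uint32_t c1)
{
    assert(width > 0 && height > 0 && stride >= width);
    for (int y = 0; y < height; y++) {
        uint32_t t = height > 1 ? (uint32_t)(y * 256 / (height - 1)) : 0;
        uint32_t color = lerp_argb32(c0, c1, t);
        uint32_t *row = pixels + (size_t)y * stride;
        for (int x = 0; x < width; x++)
            row[x] = color;
    }
}
```

A real implementation would also have to handle multiple color stops, extend
modes and clipping to the area actually drawn, none of which is shown here.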

Looking at gearflowers.svg and SVGs in general, it seems that most
patterns are either solid colors or gradients, so rendering should
always end in this composite operation:

  SRC(argb32, no transform) in MASK(a8 shape, no transform) op
  DST(probably ARGB32 or 32bpp RGB24)

We should be able to accelerate that pretty well, right?
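
For reference, the inner loop of such a fast path could look roughly like
the sketch below. This is not the libpixman implementation: the function
names are made up, "op" is assumed to be OVER, and both source and
destination are assumed to hold premultiplied ARGB32 pixels. With no
transforms involved, each destination pixel needs exactly one source fetch
and one mask fetch.

```c
#include <assert.h>
#include <stdint.h>

/* Rounded a * b / 255 for 8-bit values. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Scale all four channels of an ARGB32 pixel by an 8-bit factor. */
static uint32_t scale_argb32(uint32_t p, uint8_t m)
{
    return ((uint32_t)mul_un8((uint8_t)(p >> 24), m) << 24) |
           ((uint32_t)mul_un8((uint8_t)((p >> 16) & 0xff), m) << 16) |
           ((uint32_t)mul_un8((uint8_t)((p >> 8) & 0xff), m) << 8) |
            (uint32_t)mul_un8((uint8_t)(p & 0xff), m);
}

/* One scanline of (src IN mask) OVER dst.
 * Assumes premultiplied pixels, so the per-channel sum cannot overflow. */
static void composite_src_in_mask_over(uint32_t *dst, const uint32_t *src,
                                       const uint8_t *mask, int width)
{
    assert(width >= 0);
    for (int x = 0; x < width; x++) {
        uint8_t m = mask[x];
        if (m == 0)
            continue;                   /* no coverage: dst is unchanged */

        uint32_t s = scale_argb32(src[x], m);
        uint8_t inv_alpha = (uint8_t)(255 - (s >> 24));
        if (inv_alpha == 0) {
            dst[x] = s;                 /* opaque result: plain copy */
            continue;
        }
        dst[x] = s + scale_argb32(dst[x], inv_alpha);
    }
}
```

The two early exits matter for shape masks, where most pixels tend to be
either fully covered or not covered at all. For a solid source the src[x]
fetch would collapse to a constant as well.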
>
> I am using CVS HEAD of cairo and libpixman. The modification I made to
> svg2png is basically this:
>
> - cairo_set_target_png (cr, png_file, CAIRO_FORMAT_ARGB32, width, height);
> + cairo_set_target_image (cr, data, CAIRO_FORMAT_ARGB32, width, height, width);
-David