[cairo] ps optimization

Adrian Johnson ajohnson at redneon.com
Sun Mar 9 05:24:25 PDT 2008


Carl Worth wrote:
> On Fri, 07 Mar 2008 18:53:55 -0800, Charles Doutriaux wrote:
> I've posted the C code (uncompressed this time---roughly 7 MB) here:
> 
> 	http://cairographics.org/~cworth/tmp/dump_cairo.c
> 
>> The png output for this code, takes around a second.
>>
>> The postscript output takes around 30 minutes...
>>
>> The ps generated is 19MB!
> 
> Hmm... I'm not replicating those results. Here's what I see:
> 
>     $ gcc $(pkg-config --cflags --libs cairo) -o dump_cairo dump_cairo.c
>     $ time ./dump_cairo
>     real    2m4.201s
>     user    1m15.120s
>     sys     0m0.137s
>     $ ls -l test.ps
>     -rw-r--r-- 1 cworth cworth 3149850 2008-03-08 08:54 test.ps
> 
> So on my laptop it's only taking 2 minutes and only resulting in a
> file of 3MB. This is with cairo recently from git, but glancing
> through the logs since 1.5.12 I certainly don't see anything obvious
> that would have changed the cairo-ps performance, (in either speed or
> output size).

I am getting similar results.

Whenever there are size or performance problems with the PS/PDF backends
the cause is almost always the use of fallback images.

I've committed a change which adds a comment above each fallback image
with the size and location of the fallback image in cairo device
coordinates. The size of these comments is insignificant compared to
the size of the fallback images. However, it allows cairo users to
identify where fallbacks are used and optimize their use of the cairo
API to reduce the use of fallbacks. Now that almost all unnecessary
fallbacks have been eliminated in the PS backend, the remaining
fallbacks can only be eliminated by avoiding patterns and operations
unsupported by PS.

> Or are the differences I'm seeing entirely explained because the
> example you sent only has 1 instead of 6 figures? A factor of 6 would
> capture most of the difference.

That would appear to explain the size difference.

Looking at the fallbacks used:

$ grep Fallback test.ps | wc -l
1684
$ grep Fallback test_full_precision.ps | head
% Fallback Image: x=69, y=352, w=11, h=1 res=300dpi size=750
% Fallback Image: x=85, y=352, w=3, h=1 res=300dpi size=195
% Fallback Image: x=94, y=352, w=8, h=1 res=300dpi size=510
% Fallback Image: x=108, y=352, w=3, h=1 res=300dpi size=195
% Fallback Image: x=114, y=352, w=15, h=1 res=300dpi size=945
% Fallback Image: x=140, y=352, w=5, h=1 res=300dpi size=315
% Fallback Image: x=149, y=352, w=3, h=1 res=300dpi size=195
% Fallback Image: x=154, y=352, w=8, h=1 res=300dpi size=510
% Fallback Image: x=188, y=352, w=5, h=1 res=300dpi size=375
% Fallback Image: x=200, y=352, w=17, h=1 res=300dpi size=1065

There are a large number of tiny fallback images. The fallback images
account for 55% of the file size. The problem with such a large number
of fallback images is that all of the drawing operations for the page
are replayed to each fallback image. And with 7MB of cairo drawing code
and 1684 fallback images this is going to take some time.

With n drawing operations and m fallback images, the time required to
draw the fallbacks is proportional to n*m. With six figures instead of
one this becomes (6*n)*(6*m) = 36*n*m. This would explain why the above
numbers show that six figures take about 30 times longer than one figure.
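
As a rough illustration of that scaling argument, the sketch below only
counts replayed operations. The names replay_cost() and
print_replay_costs() are made up for this example and are not part of
cairo; the model assumes every fallback image replays every drawing
operation on the page, as described above.

```c
#include <stdio.h>

/* Hypothetical cost model: each of the m fallback images replays all n
 * drawing operations, so the total work is n * m replayed operations. */
static unsigned long long
replay_cost(unsigned long long n_ops, unsigned long long m_fallbacks)
{
    return n_ops * m_fallbacks;
}

/* Print the modelled cost for one figure and for six figures on a page. */
static void
print_replay_costs(unsigned long long n_ops, unsigned long long m_fallbacks)
{
    unsigned long long one = replay_cost(n_ops, m_fallbacks);
    unsigned long long six = replay_cost(6 * n_ops, 6 * m_fallbacks);

    printf("1 figure:  %llu replayed operations\n", one);
    printf("6 figures: %llu replayed operations\n", six);
}
```

Under this model the six-figure page costs 36 times as much as the
one-figure page, whatever the actual values of n and m are.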

>> Somebody on this list already suggested once to use 24bit pattern.
>> Unfortunately as this example shows it is not an option for us because
>> we need the transparency thru these patterns...

Looking at the C file this is the cause of the fallback images. All your
colors are opaque. However there is a lot of code similar to the following:

    image = cairo_image_surface_create(CAIRO_FORMAT_ARGB32, 4, 4);
    cr2 = cairo_create(image);
    pattern = cairo_get_source(cr);
    cairo_set_source(cr2, pattern);
    cairo_set_line_width(cr2, 1);
    cairo_move_to(cr2, 4, 0);
    cairo_line_to(cr2, 0, 4);
    cairo_stroke(cr2);
    cairo_move_to(cr2, 4, -4);
    cairo_line_to(cr2, -4, 4);
    cairo_stroke(cr2);
    cairo_move_to(cr2, 8, 0);
    cairo_line_to(cr2, 0, 8);
    cairo_stroke(cr2);
    pattern = cairo_pattern_create_for_surface(image);
    cairo_pattern_set_extend(pattern, CAIRO_EXTEND_REPEAT);
    cairo_set_source(cr, pattern);
    cairo_move_to(cr, 55.799999, 216.222702);
    cairo_line_to(cr, 57.394691, 216.222702);
    cairo_line_to(cr, 57.394691, 213.940277);
    cairo_line_to(cr, 55.799999, 213.940277);
    cairo_line_to(cr, 55.799999, 216.222702);
    cairo_fill(cr);
    cairo_set_source_rgb(cr, 0.317647, 0.317647, 0.317647);
    cairo_surface_destroy(image);

The cairo_image_surface_create(CAIRO_FORMAT_ARGB32, 4, 4) is the cause
of the fallback images. If you replace this with
cairo_surface_create_similar(surface, CAIRO_CONTENT_COLOR_ALPHA, 4, 4)
the drawing operations used on the similar surface will be recorded to a
meta-surface. When the similar surface is used as the source to draw on
the page, the meta-surface will be replayed to the PS surface resulting
in all vector PS output.

After replacing cairo_image_surface_create() with
cairo_surface_create_similar() I get:

$ time ./dump_cairo

real    0m1.576s
user    0m1.272s
sys     0m0.056s
$ ls -l test.ps
-rw-r--r-- 1 ajohnson users 4705932 2008-03-09 20:34 test.ps

Eliminating the fallbacks has solved the performance problem; however, it
has increased the file size. This is because there are more than 4000 of
these surfaces created in the dump_cairo.c file. The combined total
size of all these PS procedures, each with a few drawing operations,
exceeds the size of the fallback images.

Looking at code for some of these image surfaces, they all seem to be
the same. If the PS backend would emit the procedure once and re-use it
each time it draws the pattern the file size could be reduced to about
1.5MB. The PDF backend has similar problems with re-embedding the same
pattern every time it is used. There is work planned to eliminate this
duplication in the PDF backend. Some optimization could also be done in
the PS backend. However, for this sort of optimization to be effective you
need to create the pattern once and re-use it each time.

Some other problems I can see are:

    cairo_move_to(cr2, 4, -4);
    cairo_line_to(cr2, -4, 4);
    cairo_stroke(cr2);
    cairo_move_to(cr2, 8, 0);
    cairo_line_to(cr2, 0, 8);
    cairo_stroke(cr2);

These lines are outside of the image surface. Eliminating them would
reduce the processing time.

    cairo_move_to(cr, 55.799999, 216.222702);
    cairo_line_to(cr, 57.394691, 216.222702);
    cairo_line_to(cr, 57.394691, 213.940277);
    cairo_line_to(cr, 55.799999, 213.940277);
    cairo_line_to(cr, 55.799999, 216.222702);
    cairo_fill(cr);

The last cairo_line_to() should be replaced with cairo_close_path().
Even better would be to use cairo_rectangle(). Then the PS backend will
use a rectangle operator instead of drawing each line, which reduces the
output size.

    achar[0] = 'D';
    achar[1] = '\0';
    cairo_rel_move_to(cr, 0.000000, 0.000000);
    cairo_show_text(cr, achar);
    achar[0] = 'o';
    achar[1] = '\0';
    cairo_rel_move_to(cr, 0.000000, 0.000000);
    cairo_show_text(cr, achar);
    achar[0] = 'u';
    achar[1] = '\0';
    cairo_rel_move_to(cr, 0.000000, 0.000000);
    cairo_show_text(cr, achar);
    achar[0] = 'b';
    achar[1] = '\0';
    cairo_rel_move_to(cr, 0.000000, 0.000000);
    cairo_show_text(cr, achar);
    achar[0] = 'l';
    achar[1] = '\0';
    cairo_rel_move_to(cr, 0.000000, 0.000000);
    cairo_show_text(cr, achar);
    achar[0] = 'e';
    achar[1] = '\0';
    cairo_rel_move_to(cr, 0.000000, 0.000000);
    cairo_show_text(cr, achar);

Drawing one character at a time is very inefficient. Each call to
cairo_show_text() will generate PS output to select the pattern or
color, select the font, set the font matrix, and draw the text.

cairo_show_text() is the toy text API. It is really only intended for
writing tests and sample code for tutorials. The better solution is to
use Pango to lay out your text.

>> The time is somewhat an issue, but also the size of the ps generated...

Now that we have PDF files for storing and distributing documents, PS
files are really only used for printing. The time it takes to generate
and print a PS file is generally more important than the actual size of
the PS file.

>> I think the ps driver still needs some optimization?

There are a number of optimizations planned for the PS backend. I
estimate that it should be possible to get the PS output from
dump_cairo.c down to about 1MB.

> Quite likely. There's almost always more we can do. For example a
> simple thing that might pay off is to tweak the printing of
> floating-point number to not emit excessive digits, (I know this is
> something that Adrian is planning on doing).

Now that we are using 24.8 fixed point the number of decimal places can
be reduced. Testing dump_cairo.c with 3 decimal places instead of 6
reduced the file size of test.ps by about 10%. This is with a file that
contains a huge number of paths. Most PS files are not going to see that
much improvement.

I have not done this change yet as I need to work out how the precision
should be controlled when emitting strokes. Strokes are emitted in user
space coordinates. If doubles are rounded to 3 decimal places and a
stroke is emitted after doing cairo_scale (cr, 1000, 1000), the path
coordinates would be rounded to 72dpi.
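
The arithmetic behind that 72dpi figure can be sketched as follows.
effective_dpi() is a hypothetical helper, not a cairo function, and it
assumes one unscaled user-space unit is one PostScript point
(1/72 inch).

```c
/* Resolution, in dots per inch, of user-space coordinates printed with
 * `decimals` digits after the decimal point, when one user-space unit
 * spans `scale` points on the page. */
static double
effective_dpi(double scale, int decimals)
{
    double steps_per_unit = 1.0;
    int i;

    for (i = 0; i < decimals; i++)
        steps_per_unit *= 10.0;

    /* One rounding step is scale / steps_per_unit points wide, and
     * there are 72 points to the inch. */
    return 72.0 * steps_per_unit / scale;
}
```

With a scale of 1000 and 3 decimal places one rounding step is a whole
point, so the result is 72; with no scaling the same 3 decimal places
give 72000.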

>> Thanks for any suggestion,

Your code appears to be drawing a very large number of colored
rectangles. Looking at the output, there are a lot of contiguous areas
with the same color. If the application merged these areas into one fill
for each color, a large reduction in the PS file size could be achieved.
Even just merging consecutive rectangles of the same color on each line
would help.
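
A minimal sketch of the per-line merge, assuming the application keeps
each row as an array of cells sorted by x with no gaps between
neighbours. The cell_t type and merge_row() are hypothetical names, not
taken from dump_cairo.c.

```c
#include <stddef.h>

/* Hypothetical description of one colored cell in a row. */
typedef struct {
    double x, width;  /* horizontal extent of the cell */
    int color;        /* index into the application's color table */
} cell_t;

/* Merge neighbouring cells of the same color in place and return the
 * new number of cells. Assumes the cells are sorted by x and adjacent. */
static size_t
merge_row(cell_t *cells, size_t n)
{
    size_t out = 0, i;

    if (n == 0)
        return 0;

    for (i = 1; i < n; i++) {
        if (cells[i].color == cells[out].color) {
            /* Extend the current run to the right edge of cell i. */
            cells[out].width = cells[i].x + cells[i].width - cells[out].x;
        } else {
            cells[++out] = cells[i];
        }
    }
    return out + 1;
}
```

Each merged run can then be drawn with one rectangle and one fill
instead of one per cell.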


