[cairo] Survey of polygon rasterization techniques

Kiia Kallio kkallio at uiah.fi
Fri Aug 10 12:28:33 PDT 2007


David,

This is really great work that you ported all these algorithms into one framework.

I haven't had time to go through the implementation properly yet, but it seems that
the edge plotting loop is much more inefficient than the one I have implemented.
I profiled this, and the implementation seems to consume 25% of time in the edge
plotting.

There are for instance compares for the x coordinate in the inner loop. If the
clipping is properly implemented, these are not needed. Also, the plotting is done
to 32-bit chunks with masking, while it can be implemented directly to char
data much more efficiently. There is also some additional indexing math not
present in the original implementation, and the code doesn't use immediate
data for offsets but fetches them from an array.

There seems to be also some issues with clipping, as the program crashes with
some configurations (especially if dumping to files is in use) due to memory
trashing.

I will investigate this more and send you a patch when I have time... This will be
sometime next week earliest though.

Kiia

>>> "David Turner" <david at freetype.org> 08/08/07 7:28 PM >>>
Thanks Philipp,

I spotted and fixed the problem (a stupid typo), and this quite changes the performance
of the "tor7" algorithm (slowing it down, but making it correct). I modified the program
to dump the content of rasterization in a large .pgm image file for each rasterizer, and
this helped me spot many other problems that I hadn't to deal when using the rasterizer
in ft2demos. Mostly were related to horizontal clipping. I also found a few optimizations
to write "tor8", which is mostly the same than "tor7" with loop unrolling in a few critical parts.

the new sources are at:

  http://david.freetype.org/rasterizer-shootout/raster-comparison-20070808.tar.bz2

you can now use the "-d" flag to dump rasterization images in the current directory
(warning, these will be *large* .pgm files).

or specify a given test with the "-t" option (useful for profiling)

here are the new benchmarks (omitting most non-optimized versions now):

  "ftrasters -r50 -s20 /usr/share/fonts/truetype/msttcorefonts/Times_New_Roman.ttf"

    loading 1320 glyphs  = 0.030000 s
    rasterizer                  time        throughput             global  relative
    ================================================================================
    with dummy                :    0.200 s  (  330000 glyphs/s)
    with analytical           :    0.770 s  (   85714 glyphs/s)    x 1.00  x 1.00
    with mono1                :    0.440 s  (  150000 glyphs/s)    x 0.57  x 0.42
    with tor15x17_8           :    1.550 s  (   42580 glyphs/s)    x 2.01  x 2.37
    with tor16x16_8           :    1.590 s  (   41509 glyphs/s)    x 2.06  x 2.44
    with tor16x16_7           :    1.670 s  (   39520 glyphs/s)    x 2.17  x 2.58
    with kiia32_7             :    1.550 s  (   42580 glyphs/s)    x 2.01  x 2.37
    with tor8x32_8            :    1.180 s  (   55932 glyphs/s)    x 1.53  x 1.72
    with tor8x8_8             :    1.170 s  (   56410 glyphs/s)    x 1.52  x 1.70
    with kiia16_7             :    1.190 s  (   55462 glyphs/s)    x 1.55  x 1.74
    with kiia8_7              :    0.910 s  (   72527 glyphs/s)    x 1.18  x 1.25

  "ftrasters -r5 -s600 /usr/share/fonts/truetype/msttcorefonts/Arial.ttf"

    loading 1320 glyphs  = 0.030000 s
    rasterizer                  time        throughput             global  relative
    ================================================================================
    with dummy                :    0.070 s  (   94285 glyphs/s)
    with analytical           :    1.460 s  (    4520 glyphs/s)    x 1.00  x 1.00
    with mono1                :    0.690 s  (    9565 glyphs/s)    x 0.47  x 0.45
    with tor15x17_8           :    4.170 s  (    1582 glyphs/s)    x 2.86  x 2.95
    with tor16x16_8           :    4.810 s  (    1372 glyphs/s)    x 3.29  x 3.41
    with tor16x16_7           :    5.330 s  (    1238 glyphs/s)    x 3.65  x 3.78
    with kiia32_7             :    3.300 s  (    2000 glyphs/s)    x 2.26  x 2.32
    with tor8x32_8            :    3.800 s  (    1736 glyphs/s)    x 2.60  x 2.68
    with tor8x8_8             :    3.710 s  (    1778 glyphs/s)    x 2.54  x 2.62
    with kiia16_7             :    2.400 s  (    2750 glyphs/s)    x 1.64  x 1.68
    with kiia8_7              :    1.860 s  (    3548 glyphs/s)    x 1.27  x 1.29

  * as you can see the high-quality Kiia and Tor algorithms are now pretty equivalent
    for small sizes, while Kiaa's multi-sampler takes the lead for large ones. 

  * for low-quality/speed, Kiaa's multi-sampler takes an even larger lead, regardless
    of the size of the glyphs.

I'm still scratching my head to find other variants or optimizations.

thanks a lot

- David


On Wed, 08 Aug 2007 12:25:28 +0200, "Philipp Heise" <heise at in.tum.de> said:
> David Turner wrote:
> >   ...
> >   if you feel like it, feel free to look at the code, run it, and improve it, or ask
> >   questions in this thread.
> > 
> >   Hope this helps,
> > 
> 
> Hi David,
> 
> your program to compare the performance of various rasterization
> algorithms is really great ... Thank you!
> 
> Since there was no output for the resulting glyphs and I wanted to see
> some results  - I put the following hack in your ConverRaster() after
> raster_render() :
> 
>   {
>    FILE* f;
>    int i;
>    f=fopen("out.pgm","wb");
>    fprintf(f,"P2\n%d %d\n255\n",width,height);
>    for(i=0;i<width*height;i++)
>      fprintf(f,"%d\n",bitmap->buffer[i]);
>    fclose(f);
>    printf("press enter to continue\n");
>    getchar();
>   }
> 
> 
> The results from the rasterizer of the torXXX_7-variant seem to be wrong
> for most glyphs (tested with Bitstream Vera)...
> 
> Thanks Philipp
_______________________________________________
cairo mailing list
cairo at cairographics.org
http://lists.cairographics.org/mailman/listinfo/cairo




More information about the cairo mailing list