[poppler] [patch] improving poppler performance

Mon Aug 23 15:54:21 PDT 2010

Hi,

As part of the 'Optimization of Open Source programs' course at my 
university, I wrote some patches that should improve poppler performance.

I measured a 4-38% (avg. 13%) performance increase [1]. For example, on my 
configuration (2 GHz Core2 Duo) this sometimes reduces single-slide 
rendering time by 1.2 seconds.
Some patches increase memory consumption (by about 9%) and make the .so 
file bigger (by about 150KB), but this is controlled by a set of #defines.

The results of my benchmarks can be found at: 
http://students.mimuw.edu.pl/~pw248348/opos/plots-tests-0_base_version_vs_12_splashxpath_modern_sort.pdf

To improve and test poppler performance, I collected several 'random' 
PDFs and measured single-page rendering time.
My test framework (bash + R scripts) can be found at: 
http://students.mimuw.edu.pl/~pw248348/opos/perf_tests.tar.gz so you can 
run the tests on a different set of PDFs.
The framework simply compares two git refs and produces CPU time and 
memory usage plots. See the README for details.

Below is the list of patches:

01_get_whole_rgb_line.patch
[GfxICCBasedColorSpace, SplashOutputDev] avoid unnecessary lcms calls

03_cal_gray_reuse_pow_computations.patch
GCC does not merge consecutive pow() calls with the same arguments, so 
the result is now computed once and reused.
BTW, I was considering a fast but less accurate pow() implementation, but 
I don't know how accurate it has to be for color conversion and 
anti-aliasing.
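
As a rough illustration of the idea (not the actual patch), the sketch 
below hoists a repeated pow() call into a local variable. The CalGray and 
Xyz structs and both function names are hypothetical and only loosely 
modelled on a CalGray-style conversion.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical types for illustration only; not poppler's real classes.
struct Xyz { double x, y, z; };

struct CalGray {
    double whiteX, whiteY, whiteZ;  // white point
    double gamma;                   // gamma exponent
};

// Naive version: the same pow() expression is written three times.
inline Xyz toXyzNaive(const CalGray &cs, double gray) {
    return { cs.whiteX * std::pow(gray, cs.gamma),
             cs.whiteY * std::pow(gray, cs.gamma),
             cs.whiteZ * std::pow(gray, cs.gamma) };
}

// Reusing version: pow() is evaluated once and the result is shared.
inline Xyz toXyzReused(const CalGray &cs, double gray) {
    assert(gray >= 0.0);  // assumed precondition: non-negative component
    const double a = std::pow(gray, cs.gamma);
    return { cs.whiteX * a, cs.whiteY * a, cs.whiteZ * a };
}
```

Both functions return the same values; the second one simply does not 
depend on the compiler spotting the common subexpression.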

04_fast_indexed_dict_lookup.patch
Some PDFs contain a large number (2000+) of objects, e.g. publications 
with plots (grid, samples, lines, bands, axes, legend, scales). With a 
linear key search this results in about 2000^2 strcmp() calls.
Now, if a Dict contains more than 10 objects, an object-key index is 
built to reduce lookup time.
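
A minimal sketch of this kind of threshold-based index, assuming a sorted 
index searched with binary search (the patch itself may be organised 
differently). SmallDict, kIndexThreshold and all member names are 
hypothetical and are not poppler's Dict API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical dictionary: linear scan when small, sorted index when large.
class SmallDict {
public:
    void add(const std::string &key, int value) {
        entries_.push_back({key, value});
        indexValid_ = false;  // the index must be rebuilt after a change
    }

    // Returns a pointer to the value, or nullptr if the key is absent.
    const int *lookup(const char *key) {
        assert(key != nullptr);
        if (entries_.size() <= kIndexThreshold) {
            for (const auto &e : entries_)
                if (std::strcmp(e.first.c_str(), key) == 0) return &e.second;
            return nullptr;
        }
        if (!indexValid_) rebuildIndex();
        auto it = std::lower_bound(
            index_.begin(), index_.end(), key,
            [this](std::size_t pos, const char *k) {
                return std::strcmp(entries_[pos].first.c_str(), k) < 0;
            });
        if (it != index_.end() &&
            std::strcmp(entries_[*it].first.c_str(), key) == 0)
            return &entries_[*it].second;
        return nullptr;
    }

private:
    static constexpr std::size_t kIndexThreshold = 10;

    // Sort entry positions by key; stable so the first duplicate wins.
    void rebuildIndex() {
        index_.resize(entries_.size());
        std::iota(index_.begin(), index_.end(), std::size_t{0});
        std::stable_sort(index_.begin(), index_.end(),
                         [this](std::size_t a, std::size_t b) {
                             return entries_[a].first < entries_[b].first;
                         });
        indexValid_ = true;
    }

    std::vector<std::pair<std::string, int>> entries_;
    std::vector<std::size_t> index_;
    bool indexValid_ = false;
};
```

With n keys, n lookups cost about n * log2(n) strcmp() calls through the 
index instead of about n^2 with the linear scan, while small 
dictionaries skip the cost of building an index.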

05_bigger_glyph_pixmap_cache.patch
Previously the glyph cache hit rate was about 10-20%. Increasing its size 
6 times (up to 1.5 MB of additional memory) results in a 30-60% hit rate 
and a 3-12% performance increase.

06_splash_forced_inlines.patch
Sometimes GCC refuses to inline a function marked as inline. IMO the 
Splash pipe has to be inlined, so I forced GCC to do that.
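
One common way to force inlining with GCC is the always_inline function 
attribute, wrapped in a macro so it can be switched off. The sketch below 
assumes that approach; FORCE_INLINE, blendPixel and blendRow are 
hypothetical names, not identifiers from the patch or from Splash.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro: force inlining on GCC-compatible compilers,
// fall back to a plain inline hint elsewhere.
#if defined(__GNUC__)
#define FORCE_INLINE inline __attribute__((always_inline))
#else
#define FORCE_INLINE inline
#endif

// Small per-pixel helper of the kind that is called millions of times.
FORCE_INLINE unsigned char blendPixel(unsigned char src, unsigned char dst,
                                      unsigned char alpha) {
    return static_cast<unsigned char>(
        (src * alpha + dst * (255 - alpha)) / 255);
}

// The inner loop benefits when blendPixel() is expanded in place.
void blendRow(const unsigned char *src, unsigned char *dst,
              unsigned char alpha, std::size_t n) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = blendPixel(src[i], dst[i], alpha);
}
```

Forced inlining trades code size for speed, which is why guarding it with 
a #define can be useful.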

07_inline_lexer_calls.patch
08_inline_parser_calls.patch
11_flatten_lexer_and_parser.patch
Both lexer and parser calls should be inlined to avoid millions of 
unnecessary calls and to allow the compiler to make further 
optimizations. (*.so size +12KB, CPU time -8%)

09_misc_goo_optimizations.patch
some small GooHash and GooString optimizations

10_stream_inlines.patch
CCITTFaxStream and FlateStream inlines

12_splashxpath_modern_sort
Using qsort() from stdlib is a rather bad idea.
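
The message does not say what replaced qsort(). The sketch below assumes 
something like std::sort, whose comparator can be inlined, whereas 
qsort() calls its comparator through a function pointer for every 
comparison. Seg and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical path segment, sorted by its starting y coordinate.
struct Seg { double x0, y0, x1, y1; };

// qsort() comparator: called indirectly, arguments passed as void*.
static int cmpSegQsort(const void *a, const void *b) {
    const Seg *sa = static_cast<const Seg *>(a);
    const Seg *sb = static_cast<const Seg *>(b);
    return (sa->y0 > sb->y0) - (sa->y0 < sb->y0);
}

void sortWithQsort(std::vector<Seg> &segs) {
    if (!segs.empty())
        std::qsort(segs.data(), segs.size(), sizeof(Seg), cmpSegQsort);
}

// std::sort comparator: a lambda the compiler can inline into the sort.
void sortWithStdSort(std::vector<Seg> &segs) {
    std::sort(segs.begin(), segs.end(),
              [](const Seg &a, const Seg &b) { return a.y0 < b.y0; });
    // Debug-only postcondition: the result is ordered by y0.
    assert(std::is_sorted(segs.begin(), segs.end(),
                          [](const Seg &a, const Seg &b) {
                              return a.y0 < b.y0;
                          }));
}
```

Both functions produce the same ordering by y0; any speed difference 
depends on the compiler and the data, so it should be measured.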

You can find them at: http://students.mimuw.edu.pl/~pw248348/opos/patches/

Best regards,
Paweł Wiejacha.

[1] '50% performance increase' means: 50 = 100 * 
(old_cpu_time/new_cpu_time - 1)
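
The definition in [1] can be written as a small helper, together with the 
corresponding reduction in CPU time. The function names are 
illustrative only.

```cpp
#include <cassert>

// Performance increase in percent, as defined in footnote [1].
double perfIncreasePercent(double oldCpuTime, double newCpuTime) {
    assert(newCpuTime > 0.0);
    return 100.0 * (oldCpuTime / newCpuTime - 1.0);
}

// CPU time saved, in percent, for a given performance increase.
double timeReductionPercent(double increasePercent) {
    return 100.0 * (1.0 - 1.0 / (1.0 + increasePercent / 100.0));
}
```

For example, going from 1.5 s to 1.0 s is a 50% performance increase, 
which corresponds to about a 33% reduction in CPU time.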