[poppler] [patch] improving poppler performance
Paweł Wiejacha
pawel.wiejacha at gmail.com
Mon Aug 23 15:54:21 PDT 2010
Hi,
as part of the 'Optimization of Open Source programs' course at my
university, I wrote some patches which should improve poppler performance.
I got a 4-38% (avg. 13%) performance increase [1], which for example on my
configuration (2 GHz Core2 Duo) sometimes reduces single-slide rendering
time by 1.2 seconds.
Some patches increase memory consumption (by about 9%) and make the .so
file bigger (by about 150KB), but this is controlled by a set of #defines.
The results of my benchmarks can be found at:
http://students.mimuw.edu.pl/~pw248348/opos/plots-tests-0_base_version_vs_12_splashxpath_modern_sort.pdf
To improve and test poppler performance, I collected several 'random'
PDFs and measured single page rendering time.
My test framework (bash + R scripts) can be found at:
http://students.mimuw.edu.pl/~pw248348/opos/perf_tests.tar.gz
so you can run the tests on a different set of PDFs.
The framework simply compares two git refs and produces CPU time and
memory usage plots. See README for details.
Below is the list of patches:
01_get_whole_rgb_line.patch
[GfxICCBasedColorSpace, SplashOutputDev] avoid unnecessary lcms calls
03_cal_gray_reuse_pow_computations.patch
GCC does not reuse the result of repeated pow() calls with the same
arguments, so the patch computes it once and reuses it.
BTW, I was considering a fast but less accurate pow() implementation, but
I don't know how accurate it should be for color conversion and
anti-aliasing.
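To illustrate the idea (this is a hypothetical sketch, not the code from
the patch): a CalGray-style conversion scales each white point component
by the same gray^gamma value, so pow() only needs to be evaluated once.
The struct and function names below are made up for the example.

```cpp
#include <cmath>

// Hypothetical types; not poppler's actual classes.
struct Xyz { double x, y, z; };

// Before: the same pow(gray, gamma) is evaluated three times.
Xyz calGrayToXyzNaive(double gray, double gamma, const Xyz &white) {
    return { white.x * std::pow(gray, gamma),
             white.y * std::pow(gray, gamma),
             white.z * std::pow(gray, gamma) };
}

// After: pow() is evaluated once and the result is reused.
Xyz calGrayToXyzReuse(double gray, double gamma, const Xyz &white) {
    const double l = std::pow(gray, gamma);
    return { white.x * l, white.y * l, white.z * l };
}
```

Both functions return the same values; the second one just avoids two
library calls per converted sample.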
04_fast_indexed_dict_lookup.patch
Some PDFs contain a large number (2000+) of objects, e.g. publications
with plots (grid, samples, lines, bands, axes, legend, scales). With a
linear search per lookup, this results in about 2000^2 strcmp() calls.
Now, if a Dict contains more than 10 objects, an object-key index is
built to reduce lookup time.
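A minimal sketch of this kind of lookup, assuming a sorted index searched
with binary search (the patch may build its index differently). SmallDict,
Entry and kIndexThreshold are hypothetical names, not poppler's Dict API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical miniature dictionary, for illustration only.
struct Entry { std::string key; int value; };

class SmallDict {
public:
    void add(const std::string &key, int value) {
        entries.push_back({key, value});
        indexValid = false;  // the index is rebuilt lazily after a change
    }

    // Stores the value in 'out' and returns true if the key is present.
    bool lookup(const char *key, int &out) {
        if (entries.size() <= kIndexThreshold) {
            // Few entries: a linear strcmp() scan is cheap enough.
            for (const Entry &e : entries)
                if (std::strcmp(e.key.c_str(), key) == 0) { out = e.value; return true; }
            return false;
        }
        if (!indexValid) buildIndex();
        // Binary search over entry positions sorted by key: O(log n) strcmp() calls.
        auto it = std::lower_bound(index.begin(), index.end(), key,
            [this](std::size_t i, const char *k) {
                return std::strcmp(entries[i].key.c_str(), k) < 0;
            });
        if (it == index.end() || std::strcmp(entries[*it].key.c_str(), key) != 0)
            return false;
        out = entries[*it].value;
        return true;
    }

private:
    static constexpr std::size_t kIndexThreshold = 10;

    void buildIndex() {
        index.resize(entries.size());
        for (std::size_t i = 0; i < index.size(); ++i) index[i] = i;
        // stable_sort keeps insertion order among equal keys, so the first
        // inserted duplicate wins, as it does in the linear scan.
        std::stable_sort(index.begin(), index.end(),
            [this](std::size_t a, std::size_t b) {
                return std::strcmp(entries[a].key.c_str(), entries[b].key.c_str()) < 0;
            });
        indexValid = true;
    }

    std::vector<Entry> entries;
    std::vector<std::size_t> index;
    bool indexValid = false;
};
```

Small dictionaries skip the index entirely, so they pay neither the extra
memory nor the sorting cost.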
05_bigger_glyph_pixmap_cache.patch
Previously the glyph cache hit rate was about 10-20%. Increasing its size
6 times (up to 1.5 MB of additional memory) results in a 30-60% hit rate
and a 3-12% performance increase.
06_splash_forced_inlines.patch
Sometimes GCC refuses to inline a function marked as inline. IMO the
Splash pipe has to be inlined, so I forced GCC to do that.
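For reference, GCC provides the always_inline function attribute for this
purpose. Below is a minimal sketch; the FORCE_INLINE macro and the blend
functions are hypothetical and not taken from the patch.

```cpp
// FORCE_INLINE is a made-up macro name for this example.
#if defined(__GNUC__)
#define FORCE_INLINE inline __attribute__((always_inline))
#else
#define FORCE_INLINE inline  // other compilers: plain inline hint only
#endif

// Hypothetical per-pixel operation that is called once per pixel.
FORCE_INLINE unsigned char blendPixel(unsigned char src, unsigned char dst,
                                      unsigned char alpha) {
    return static_cast<unsigned char>((src * alpha + dst * (255 - alpha)) / 255);
}

// The hot loop: with forced inlining there is no call overhead per pixel,
// and the compiler can optimize across the loop body.
void blendRow(const unsigned char *src, unsigned char *dst, int n,
              unsigned char alpha) {
    for (int i = 0; i < n; ++i)
        dst[i] = blendPixel(src[i], dst[i], alpha);
}
```

The trade-off is code size: every call site gets its own copy of the
function body.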
07_inline_lexer_calls.patch
08_inline_parser_calls.patch
11_flatten_lexer_and_parser.patch
Both lexer and parser calls should be inlined to avoid millions of
unnecessary calls and to allow the compiler to make further optimizations.
(*.so size +12KB, CPU time -8%)
09_misc_goo_optimizations.patch
some small GooHash and GooString optimizations
10_stream_inlines.patch
CCITTFaxStream and FlateStream inlines
12_splashxpath_modern_sort
Using qsort() from stdlib is a rather bad idea.
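A sketch of the difference, assuming the replacement is std::sort (the
patch name suggests it, but this is a guess). qsort() calls its comparator
through a function pointer on every comparison, while std::sort can inline
the comparator. The Seg struct is hypothetical.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Hypothetical segment type sorted by its y0 coordinate.
struct Seg { double y0; double x0; };

// qsort() comparator: called through a function pointer, takes void*.
static int cmpSegQsort(const void *a, const void *b) {
    const Seg *sa = static_cast<const Seg *>(a);
    const Seg *sb = static_cast<const Seg *>(b);
    if (sa->y0 < sb->y0) return -1;
    if (sa->y0 > sb->y0) return 1;
    return 0;
}

void sortWithQsort(std::vector<Seg> &segs) {
    if (segs.empty()) return;
    std::qsort(segs.data(), segs.size(), sizeof(Seg), cmpSegQsort);
}

// std::sort: the comparator is a type known at compile time and can be inlined.
void sortWithStdSort(std::vector<Seg> &segs) {
    std::sort(segs.begin(), segs.end(),
              [](const Seg &a, const Seg &b) { return a.y0 < b.y0; });
}
```

Neither variant is stable, so elements with equal keys may end up in a
different relative order.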
You can find them at: http://students.mimuw.edu.pl/~pw248348/opos/patches/
Best regards,
Paweł Wiejacha.
[1] '50% performance increase' means: 50 = 100 *
(old_cpu_time/new_cpu_time - 1)