[poppler] [patch] improving poppler performance

Albert Astals Cid aacid at kde.org
Tue Aug 24 15:56:15 PDT 2010


On Monday, 23 August 2010, Paweł Wiejacha wrote:
> Hi,

Hi

> as a part of 'Optimization of Open Source programs' course at my
> university I wrote some patches which should improve poppler performance.
> 
> I got a 4-38% (avg. 13%) performance increase [1], which for example on my
> configuration (2 GHz Core2 Duo) sometimes reduces single slide rendering
> time by 1.2 seconds.
> Some patches introduce higher memory consumption (about 9%) and make the .so
> file bigger (about 150KB), but this is controlled by a set of #defines.
> 
> The results of my benchmarks can be found at:
> http://students.mimuw.edu.pl/~pw248348/opos/plots-tests-0_base_version_vs_12_splashxpath_modern_sort.pdf
> 
> To improve and test poppler performance, I collected several 'random'
> PDFs and measured single page rendering time.
> My test framework (bash + R scripts) can be found at:
> http://students.mimuw.edu.pl/~pw248348/opos/perf_tests.tar.gz so you can
> run tests on a different set of PDFs.
> The framework simply compares two git refs and produces CPU time and
> memory usage plots. See README for details.
> 
> Below is the list of patches:
> 
> 01_get_whole_rgb_line.patch
> [GfxICCBasedColorSpace, SplashOutputDev] avoid unnecessary lcms calls

Seems ok, have applied it locally and will see if it creates any regression

> 
> 03_cal_gray_reuse_pow_computations.patch
> GCC does not reuse subsequent pow() calls with the same arguments.
> BTW I was considering a fast but less accurate pow() implementation, but I
> don't know how accurate it should be for color conversion and
> anti-aliasing.

Seems ok, have applied it locally and will see if it creates any regression
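
For readers without the patch at hand, the idea of reusing a pow() result can be sketched roughly as below. This is only an illustration, not the code from 03_cal_gray_reuse_pow_computations.patch: the XYZ struct and both function names are hypothetical, and the formula is an assumed gray-to-XYZ style conversion chosen because it raises the same base to the same exponent several times.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical colour triple, not a poppler type.
struct XYZ { double x, y, z; };

// Naive form: three pow() calls with identical arguments.
inline XYZ grayToXYZNaive(double gray, double gamma, const XYZ &white) {
    return { white.x * std::pow(gray, gamma),
             white.y * std::pow(gray, gamma),
             white.z * std::pow(gray, gamma) };
}

// Reusing form: pow() is evaluated once and the result is shared.
inline XYZ grayToXYZReuse(double gray, double gamma, const XYZ &white) {
    assert(gray >= 0.0); // a negative base with a fractional exponent yields NaN
    const double a = std::pow(gray, gamma);
    return { white.x * a, white.y * a, white.z * a };
}
```

Both functions should return the same values; the second simply does the expensive call once instead of three times.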

> 
> 04_fast_indexed_dict_lookup.patch
> Some PDFs contain large amounts (2000+) of Objects, e.g. publications
> with plots (grid, samples, lines, bands, axes, legend, scales). This
> results in 2000^2 strcmp() calls.
> Now if a Dict contains > 10 objects, an object-key index is built to reduce
> lookup time.

Seems a good idea, but at the moment we have a policy of not using std::, so 
the patch would be rejected as it stands. Will have a look to see how much we 
really win and whether it would make sense to loosen that policy if the win is 
big enough.
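
To make the trade-off concrete, here is a minimal sketch of the indexing idea: a linear strcmp() scan for small dictionaries and a sorted key index with binary search above a threshold. It is not the patch itself. IndexedDict, its members and the int values are hypothetical stand-ins, and it uses std:: exactly as the policy discussed above would currently forbid.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a PDF dictionary; values are ints to keep it short.
// Key pointers are not copied, so they must outlive the dictionary.
class IndexedDict {
public:
    void add(const char *key, int value) {
        assert(key != nullptr);
        entries.push_back(Entry{key, value});
        indexValid = false; // any existing index is now stale
    }

    // Stores the value of the first entry named `key` in *out; false if absent.
    bool lookup(const char *key, int *out) {
        if (entries.size() <= kIndexThreshold) {
            // Small dictionary: a linear scan is cheap enough.
            for (const Entry &e : entries) {
                if (std::strcmp(e.key, key) == 0) { *out = e.value; return true; }
            }
            return false;
        }
        if (!indexValid) buildIndex();
        // Binary search over entry positions ordered by key.
        auto it = std::lower_bound(index.begin(), index.end(), key,
            [this](std::size_t pos, const char *k) {
                return std::strcmp(entries[pos].key, k) < 0;
            });
        if (it == index.end() || std::strcmp(entries[*it].key, key) != 0) return false;
        *out = entries[*it].value;
        return true;
    }

private:
    struct Entry { const char *key; int value; };
    static constexpr std::size_t kIndexThreshold = 10;

    void buildIndex() {
        index.resize(entries.size());
        for (std::size_t i = 0; i < index.size(); ++i) index[i] = i;
        // stable_sort keeps insertion order among equal keys, so the first
        // duplicate still wins, as it would in a linear scan.
        std::stable_sort(index.begin(), index.end(),
            [this](std::size_t a, std::size_t b) {
                return std::strcmp(entries[a].key, entries[b].key) < 0;
            });
        indexValid = true;
    }

    std::vector<Entry> entries;
    std::vector<std::size_t> index;
    bool indexValid = false;
};
```

With n entries, n lookups cost on the order of n log n string comparisons instead of n^2, which is where the win for 2000+ object dictionaries would come from. The same structure could be written without std:: using a plain array and a hand-written binary search.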

> 
> 05_bigger_glyph_pixmap_cache.patch
> Previously glyph cache hit rate was about 10-20%. Increasing its size 6
> times (up to 1.5 MB additional memory) results in 30-60% hit rate and
> 3-12% performance increase.

Seems a good idea, will have a look in the coming days.

> 
> 06_splash_forced_inlines.patch
> Sometimes GCC refuses to inline functions marked as inline. IMO the Splash
> pipe has to be inlined, so I forced GCC to do that.

No idea which compiler you use, but gcc 4.4.3 with -O2 already inlines all/most 
of those methods for me.
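
For reference, forcing inlining in GCC is normally done with the always_inline function attribute. The sketch below shows the mechanism only; FORCE_INLINE, blendPixel and blendRow are hypothetical names, not taken from Splash or from the patch.

```cpp
#include <cassert>

// Hypothetical macro: GCC-style forced inlining, plain `inline` elsewhere.
#if defined(__GNUC__)
#define FORCE_INLINE inline __attribute__((always_inline))
#else
#define FORCE_INLINE inline
#endif

// Hypothetical per-pixel helper of the kind a rendering pipe calls in a hot loop.
static FORCE_INLINE unsigned char blendPixel(unsigned char src, unsigned char dst,
                                             unsigned char alpha) {
    // Rounded integer blend: alpha == 255 gives src, alpha == 0 gives dst.
    return static_cast<unsigned char>((src * alpha + dst * (255 - alpha) + 127) / 255);
}

void blendRow(const unsigned char *src, unsigned char *dst, unsigned char alpha, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        dst[i] = blendPixel(src[i], dst[i], alpha); // call is expanded in place
    }
}
```

Whether the attribute changes anything depends on the compiler version and flags. Building with GCC's -Winline option is one way to see which functions declared inline were not inlined.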

> 
> 07_inline_lexer_calls.patch
> 08_inline_parser_calls.patch
> 11_flatten_lexer_and_parser.patch
> Both lexer and parser calls should be inlined to avoid millions of
> unnecessary calls and allow the compiler to make further optimizations.
> (*.so size +12KB, CPU time -8%)

Will have a look at this

> 
> 09_misc_goo_optimizations.patch
> some small GooHash and GooString optimizations

gcc 4.4.3 with -O2 inlines those already, no win

> 
> 10_stream_inlines.patch
> CCITTFaxStream and FlateStream inlines

Will have a look at this in the coming days

> 
> 12_splashxpath_modern_sort
> using qsort() from stdlib is a rather bad idea

Same problem of using std:: here, will have to evaluate whether the win is big 
enough to break the policy.
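
The difference being weighed here can be sketched as follows. qsort() calls its comparator through a function pointer on untyped memory, which the compiler usually cannot inline, while std::sort is a template, so the comparator is typically inlined into the sort loop. The Seg struct and all function names below are hypothetical and only stand in for whatever SplashXPath actually sorts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical segment record ordered by y, then x.
struct Seg { double y; double x; };

// C-style comparator: called through a function pointer for every comparison.
static int cmpSegQsort(const void *a, const void *b) {
    const Seg *sa = static_cast<const Seg *>(a);
    const Seg *sb = static_cast<const Seg *>(b);
    if (sa->y != sb->y) return sa->y < sb->y ? -1 : 1;
    if (sa->x != sb->x) return sa->x < sb->x ? -1 : 1;
    return 0;
}

void sortSegsQsort(Seg *segs, int n) {
    assert(n >= 0);
    std::qsort(segs, static_cast<std::size_t>(n), sizeof(Seg), cmpSegQsort);
}

// Functor comparator: its type is known to std::sort at compile time.
struct SegLess {
    bool operator()(const Seg &a, const Seg &b) const {
        return a.y != b.y ? a.y < b.y : a.x < b.x;
    }
};

void sortSegsStdSort(Seg *segs, int n) {
    assert(n >= 0);
    std::sort(segs, segs + n, SegLess());
}
```

Both functions produce the same ordering for distinct keys, so timing them on real path data would show how big the win is.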

Thanks for the patches.

Albert

> 
> You can find them at: http://students.mimuw.edu.pl/~pw248348/opos/patches/
> 
> Best regards,
> Paweł Wiejacha.
> 
> [1] '50% performance increase' means: 50 = 100 *
> (old_cpu_time/new_cpu_time - 1)
> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/poppler
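
Footnote [1] above defines the performance increase as a speed ratio, which is not the same number as the percentage of CPU time saved. A small sketch of both quantities follows; the function names are hypothetical and the second formula is derived from the footnote's definition, not taken from the original mail.

```cpp
#include <cassert>

// Footnote [1]: increase = 100 * (old_cpu_time / new_cpu_time - 1).
double perfIncreasePercent(double oldCpuTime, double newCpuTime) {
    assert(newCpuTime > 0.0);
    return 100.0 * (oldCpuTime / newCpuTime - 1.0);
}

// Derived: share of the old CPU time that is saved for a given increase.
// From old/new = 1 + p/100 it follows that (old - new)/old = 1 - 1/(1 + p/100).
double timeSavedPercent(double increasePercent) {
    assert(increasePercent > -100.0);
    return 100.0 * (1.0 - 1.0 / (1.0 + increasePercent / 100.0));
}
```

For example, going from 1.5 s to 1.0 s is a 50% increase under this definition, while the CPU time itself drops by about a third.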

