[poppler] More poppler speedups
Krzysztof Kowalczyk
kkowalczyk at gmail.com
Sat Sep 2 17:31:17 PDT 2006
Hello,
I've made more progress in my quest to improve the performance of poppler.
1. https://bugs.freedesktop.org/show_bug.cgi?id=8111
Currently every Dict::find(char *) call (and these are the majority) needs to
convert the char * to a UGooString, which involves memory allocations.
I've implemented the ability to find(char *) directly by adding
UGooString::cmp(char *), which has the same semantics as
UGooString::cmp(UGooString(char*)).
The patch also converts tabs to spaces so that the code always has correct
indentation (regardless of your current tab settings) and does a small
refactoring to use dictLookupBool() and dictLookupInt() for simpler code.
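To illustrate the idea behind a cmp(char *) overload, here is a minimal sketch.
WideStr is a hypothetical stand-in for UGooString (its layout and method
bodies are my assumptions, not the actual poppler code); the point is only
that the second overload walks the C string in place instead of building a
temporary object first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a wide-character string such as UGooString;
// the real class in poppler may be laid out differently.
class WideStr {
public:
    explicit WideStr(const char *s) {
        for (; *s != '\0'; ++s)
            chars_.push_back(static_cast<unsigned char>(*s));
    }

    // Compare against another wide string: negative, zero or positive.
    int cmp(const WideStr &other) const {
        std::size_t n = chars_.size() < other.chars_.size()
                            ? chars_.size() : other.chars_.size();
        for (std::size_t i = 0; i < n; ++i) {
            if (chars_[i] != other.chars_[i])
                return chars_[i] < other.chars_[i] ? -1 : 1;
        }
        if (chars_.size() == other.chars_.size())
            return 0;
        return chars_.size() < other.chars_.size() ? -1 : 1;
    }

    // Same result as cmp(WideStr(s)), but no temporary is constructed,
    // so the lookup path performs no heap allocation.
    int cmp(const char *s) const {
        std::size_t i = 0;
        for (; i < chars_.size() && s[i] != '\0'; ++i) {
            unsigned int c = static_cast<unsigned char>(s[i]);
            if (chars_[i] != c)
                return chars_[i] < c ? -1 : 1;
        }
        if (i == chars_.size() && s[i] == '\0')
            return 0;                       // same length, same content
        return i == chars_.size() ? -1 : 1; // the shorter string sorts first
    }

private:
    std::vector<unsigned int> chars_;
};
```

A dictionary lookup can then call key.cmp("Length") directly for each entry
rather than converting "Length" once per call.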
2. https://bugs.freedesktop.org/show_bug.cgi?id=8112
Currently, during Parser::getObj() and when adding values to dictionaries,
poppler makes unnecessary copies of objects. When the objects are strings, this
means reallocating memory. Since Parser::getObj() is called (literally) N*100k
times, those memory allocations contribute significantly to the execution time
(e.g. delete is the 4th most expensive function).
I've really made only two small changes:
a) avoiding a memory copy by giving the caller ownership of the object and
nulling the object in one code path
b) optimizing the use of the UGooString key variable (it was always constructed
but is not always needed)
The patch also adds helper Dict::add* functions to make b) (and maybe other
future changes like that) possible.
This change gives me a consistent ~7% speedup in the PDF loading stage.
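As a rough illustration of change a), the sketch below contrasts copying a
string-carrying object into a container with handing over ownership of its
buffer. StrObj and MiniDict are hypothetical names invented for this example,
and it uses C++11 move semantics for brevity; the actual patch works on
poppler's own Object and Dict classes and may look quite different.

```cpp
#include <cassert>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical simplified object that owns a heap-allocated C string.
struct StrObj {
    char *data = nullptr;

    explicit StrObj(const char *s) : data(new char[std::strlen(s) + 1]) {
        std::strcpy(data, s);
    }
    // Copying allocates a new buffer and duplicates the bytes (expensive).
    StrObj(const StrObj &o) {
        if (o.data) {
            data = new char[std::strlen(o.data) + 1];
            std::strcpy(data, o.data);
        }
    }
    // Moving hands the buffer over and nulls the source (cheap).
    StrObj(StrObj &&o) noexcept : data(o.data) { o.data = nullptr; }
    StrObj &operator=(const StrObj &) = delete;
    ~StrObj() { delete[] data; }
};

// Hypothetical container standing in for a dictionary's value storage.
struct MiniDict {
    std::vector<StrObj> values;

    // Old behaviour: the dictionary stores its own copy of the value.
    void addCopy(const StrObj &v) { values.push_back(v); }

    // New behaviour: the dictionary takes over v's buffer; v is left empty,
    // so no allocation and no later delete of a duplicate is needed.
    void addOwned(StrObj &v) { values.push_back(std::move(v)); }
};
```

With addOwned() the buffer allocated while parsing is the same one the
dictionary ends up holding, which is where the saved new/delete pairs come
from.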
Patches are attached to respective bugs. Or you can get this and my
other improvements from
http://blog.kowalczyk.info/software/sumatrapdf/develop.html
I think there are more opportunities for improvements like 2), although
even bigger gains would come from improving the various
Stream::getChar() methods (currently Lexer::getChar(),
EmbedStream::getChar() and FlateStream::getChar() are in the top 5 of the
most expensive methods during loading). I haven't yet found a way to improve
that.
-- kjk
Sumatra (PDF Viewer for Windows):
http://blog.kowalczyk.info/software/sumatrapdf