[poppler] More poppler speedups
kkowalczyk at gmail.com
Sat Sep 2 18:20:15 PDT 2006
On 9/2/06, Jeff Muizelaar <jeff at infidigm.net> wrote:
> > I think there are more opportunities for improvements like 2) although
> > even more improvements would come from improving various
> > Stream::getChar() methods (currently Lexer::getChar(),
> > EmbedStream::getChar() and FlateStream::getChar() are in the top 5 of
> > the most expensive methods during loading). I haven't yet found a way
> > to improve that.
> One of the ways that I looked at optimizing these was by adding a read()
> method to the Stream class that reads multiple bytes instead of a single
> one. I have a patch from a long time ago that adds something like this.
> The problem I ran into was with things like inline images (EmbedStream).
> With these streams there is no way of knowing ahead of time how long the
> stream is so you have to be very careful not to read more than you are
> supposed to. This is also the source of the current problem with the
> zlib based version of FlateStream.
> The solution, it seems, is just to be careful not to read more than 1
> byte when the stream does not have a limited length. I have a patch
> around that fixes the zlib-based FlateStream to only read as much as it
> needs so the read() method should be feasible. This should help drop the
> FlateStream::getChar() overhead as long as it isn't reading from an
Yeah, I tried that and hit the same problems. Also, embedded or
filtered streams are created starting at the current position in the
underlying stream, and caching breaks this.
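
To make the rule from the quoted mail concrete, here is a minimal sketch of a read() that reads in bulk only when the stream length is known and otherwise consumes at most one byte per call. ByteSource, its lengthKnown flag and position() are hypothetical names made up for illustration; this is not poppler's Stream API and not either of the patches mentioned.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a byte stream (not poppler's Stream class).
class ByteSource {
public:
    ByteSource(std::string data, bool lengthKnown)
        : data_(std::move(data)), lengthKnown_(lengthKnown) {}

    // Single-byte read; returns -1 at end of data.
    int getChar() {
        if (pos_ >= data_.size()) return -1;
        return static_cast<unsigned char>(data_[pos_++]);
    }

    // Bulk read: copies up to n bytes into buf and returns the count.
    // If the length is not known ahead of time (e.g. an inline image),
    // at most one byte is consumed per call, so the underlying position
    // never runs past what the caller has actually used.
    std::size_t read(unsigned char *buf, std::size_t n) {
        assert(buf != nullptr || n == 0);
        std::size_t limit = lengthKnown_ ? n : (n > 0 ? 1 : 0);
        std::size_t got = 0;
        while (got < limit) {
            int c = getChar();
            if (c < 0) break;  // end of data
            buf[got++] = static_cast<unsigned char>(c);
        }
        return got;
    }

    // Number of bytes consumed so far.
    std::size_t position() const { return pos_; }

private:
    std::string data_;
    std::size_t pos_ = 0;
    bool lengthKnown_;
};
```

With this shape a caller can always ask for a full buffer and simply gets a short read on unbounded streams, so only the bounded case benefits from the bulk path.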
I did manage to get speedups by improving lookChar()/getChar() to cache
the latest value (they frequently re-fetch the same char from the
substream) but my attempts at a read() interface only generated
spectacular crashes. I'm still hopeful it's possible but need to
spend more time understanding the code.
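
Roughly, the lookChar()/getChar() caching described above could look like the sketch below. SubStream and CachingStream are hypothetical classes written only for this illustration (the calls counter exists just to show how often the substream is hit); the real poppler classes are structured differently.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical underlying stream; counts how often it is asked for a byte.
class SubStream {
public:
    explicit SubStream(std::string data) : data_(std::move(data)) {}

    // Returns the next byte, or -1 at end of data.
    int getChar() {
        ++calls;
        if (pos_ >= data_.size()) return -1;
        return static_cast<unsigned char>(data_[pos_++]);
    }

    int calls = 0;

private:
    std::string data_;
    std::size_t pos_ = 0;
};

// Wrapper that remembers the last peeked value so that repeated
// lookChar() calls do not go back to the substream.
class CachingStream {
public:
    explicit CachingStream(SubStream &sub) : sub_(sub) {}

    // Peek at the next byte without consuming it.
    int lookChar() {
        if (!hasCached_) {
            cached_ = sub_.getChar();
            hasCached_ = true;
        }
        return cached_;
    }

    // Consume the next byte, reusing the peeked value if there is one.
    int getChar() {
        if (hasCached_) {
            hasCached_ = false;
            return cached_;
        }
        return sub_.getChar();
    }

private:
    SubStream &sub_;
    int cached_ = -1;
    bool hasCached_ = false;
};
```

Note that after a lookChar() the substream is already one byte ahead of what the caller has consumed, which is the kind of position mismatch that makes creating an embedded stream at the "current position" tricky.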