[poppler] [PATCH] ~10% speedup for loading/parsing a PDF file through simple GooString optimization

Leonard Rosenthol leonardr at pdfsages.com
Mon Aug 14 04:23:04 PDT 2006


At 12:03 AM 8/14/2006, Krzysztof Kowalczyk wrote:
>Looking at the profile data, it looks like a lot of allocations are
>due to copying between instances of Object that are obtained from
>Lexer::getObj(). The best solution would probably be to rework
>Lexer::getObj() and its callers to not do copying.

         Seems reasonable.


>However, the code is very undisciplined about how it copies data
>(sometimes it's the expensive, deep copy, sometimes it's a shallow
>copy that just copies data like Object obj = *objOrig). Given that
>it's hard to make a change without breaking stuff. My attempt at a
>simple change like making Object::string an embedded value instead of
>a pointer failed and I don't even understand why.

         What about using a smart pointer?  Something as simple as
std::auto_ptr<> for a start, or possibly going all the way to
boost::shared_ptr<>.
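
         As a rough illustration of the shared-pointer idea (a sketch
only, not poppler code): the class below is a hypothetical stand-in for
an Object that holds a string, and every name in it is made up for the
example.  It uses std::shared_ptr, the standard-library counterpart of
boost::shared_ptr, on the assumption that a C++11 or later compiler is
available.  std::auto_ptr is not used here because copying it transfers
ownership rather than sharing it, and it was removed in C++17.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an Object holding a string.
// This is NOT poppler's real Object class; all names are illustrative.
class SketchObject {
public:
    explicit SketchObject(std::string s)
        : str_(std::make_shared<const std::string>(std::move(s))) {}

    // The compiler-generated copy constructor copies only the pointer
    // and bumps a reference count; the character data is not duplicated.
    const std::string &text() const { return *str_; }
    const std::string *buffer() const { return str_.get(); }
    long owners() const { return str_.use_count(); }

private:
    std::shared_ptr<const std::string> str_;  // immutable, shared payload
};

// Self-check: a copy shares its buffer with the original.
inline void checkCopySharesBuffer() {
    SketchObject a("(Hello PDF)");
    SketchObject b = a;  // shallow copy, no new string allocation
    assert(a.buffer() == b.buffer());
    assert(a.owners() == 2);
    assert(b.text() == "(Hello PDF)");
}
```

         Keeping the payload const means a shallow copy like
Object obj = *objOrig can never be modified behind another owner's
back, which is the usual hazard when deep and shallow copies are mixed.
Whether this fits the real Object layout would still need checking
against the profile.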


>Another big area of improvement would be Stream implementation.
>Currently e.g. Lexer::getChar() is at the top of the profile while the
>time spent actually reading the file doesn't even register. Those
>layered virtual calls hurt a lot.

         That's because you have a lot of characters to read.  I
don't think this is really going to gain you anything, since the data
has to be parsed anyway.  I think allocating the objects FROM the
parsed data is indeed where you will get the gain.


Leonard

---------------------------------------------------------------------------
Leonard Rosenthol                            <mailto:leonardr at pdfsages.com>
Chief Technical Officer                      <http://www.pdfsages.com>
PDF Sages, Inc.                              215-938-7080 (voice)
                                              215-938-0880 (fax)


