[poppler] A commit in git doubled rendering time for some PDFs
brian.ewins at gmail.com
Mon Nov 1 03:48:01 PDT 2010
On 23 October 2010 07:37, Dennis Sheil <dennis-poppler at vartmp.com> wrote:
> I've been looking at a commit that more than doubled rendering speed
> for some PDFs. The commit is from November 2009, but it just hit the
> mainstream vanilla stable Ubuntu platform with the release of Ubuntu
> 10.10 last week (the commit is not to be found in the Debian unstable
> poppler package as of yet). The commit is
> f83b677a8eb44d65698b77edb13a5c7de3a72c0f . The purpose of the commit
> was to fix the order of text selection in column displays.
> I tested the speed of rendering on three poppler commits - the latest
> one (c64a49307782299cb7a950a66419f9d59707f38b), the troubling one
> (f83b677a8eb44d65698b77edb13a5c7de3a72c0f) and two commits prior to
> the troubling one (345ed51af9b9e7ea53af42727b91ed68dcc52370). I first
> tested rendering on the PDF on which I noticed the problem, which is a
> PDF which shows bus routes which are local to me (
> http://www.mta.info/nyct/maps/busqns.pdf ). For the latest commit
> (c64a...) and the troubling one (f83b...) rendering time was more than
> twice that of the commit two prior to the troubling one (34e...). On
> my desktop, which has a 2.8 Ghz dual core CPU, the post-commit
> rendering took over 8 seconds, the pre-commit rendering was less than
> 3 seconds.
> I wanted to test another PDF, so I randomly searched Google for map
> PDFs and grabbed another PDF to test -
> . Rendering on the pre-f83b commit was 50% faster than from the f83b
> commit and the latest (c64a...) commit. So this is not just on that
> one PDF.
> Another thing I did was from the latest (c64a...) git commit I
> rendered against the latest code, and also rendered against that
> latest commit, but with the call to the TextBlock class's
> visitDepthFirst method from the TextPage class's coalesce method
> commented out. I got similar results to above - with the method
> uncommented, rendering time increased from 50% to over 100%.
> I mentioned this on Freenode IRC channel #poppler. Someone, I believe
> it was Andrea Canciani, looked at the visitDepthFirst method and said
> the two nested loops looked wrong. I put a counter in
> visitDepthFirst's inner for loop to see how many times it ran on my
> one-page 751K PDF. The inner for loop ran over 196 million times!
> That's 32 times for every bit my PDF file has. From within the inner
> for loop, if the TextBlock data structure blk3 is not equal to either
> blk1 or blk2, it will make not one, but two calls to the
> isBeforeByRule1 method. I didn't count how many times that hit, but
> profiling showed isBeforeByRule1 was called quite a lot. No wonder my
> PDFs are rendering slow.
> Anyhow, as I said, that f83b... commit is said to have fixed sort
> order for text selected from columns. Although it also more than
> doubling the render speed of some of these PDFs, in which there is no
> need for this topological sorting. So that would be my problem
> statement. Next step is the solution - keeping the old render time
> (or at least something near it, in cases where topological sorting is
> not important) but still keeping the fix that orders selection of text
> from columns properly.
Holding my hand up for this one... that's my code, and it's not very
smart. I had a look and here's some ways we can improve this.
- the outer loop calling visitDepthFirst, and the outermost loop in
visitDepthFirst, both loop over all blocks. Both loops, but
particularly the second one, should continue with the next unvisited
block (so we need to track what that is).
- we should pre-sort the blocks into primary page order (increasing y,
in the usual case) in order to enable a few further optimisations, as
- instead of looping over all blocks and testing 'isBeforeByRule1',
that test should only happen for unvisited blocks before block1 in y
order. That ensures one half of the isBeforeByRule1 check, so we only
need to check if the blocks overlap.
- the isBeforeByRule2 check only applies to blocks /after/ block1 in y
order; make this a separate loop.
- For rule2, instead of checking individually whether a 'block3'
exists between block1 and each block2 in this loop, keep track of a
window that widens from the width of block1 to cover any blocks the
window overlaps. It's not quite the same as the current rule2 check
but it's way more efficient, discarding the entire inner loop.
Together these changes should reduce the number of calls by a factor
of about (blocks)^3 (ish)...which is quite significant.
I'll see if I can get my build updated with this.
Thanks for reporting the problem.
> If I have the time, I will do more work on this problem. I am
> currently unsure of whether or not I will be able to find the time to
> do more work on it though.
> -- Dennis
> poppler mailing list
> poppler at lists.freedesktop.org
More information about the poppler