[Poppler-bugs] [Bug 68322] pdftops 20 Aug 2013 is about 10% slower than last year

Tue Nov 12 21:07:47 PST 2013

https://bugs.freedesktop.org/show_bug.cgi?id=68322

--- Comment #7 from William Bader <williambader at hotmail.com> ---
Created attachment 89118
  --> https://bugs.freedesktop.org/attachment.cgi?id=89118&action=edit
patch for Splash::expandRow() with memcpy instead of a for loop

>Hmmmm, isn't that just special casing for your file?

Yes, I think that it might be over-tuning.  I had expected that most of the
expansions would use simple ratios that need no real-number interpolation, or
possibly no interpolation at all, but when I put in debug code, all of my
test files had complicated ratios.  My test files are newspaper advertisements,
and the artists seem to scale the bitmapped images to fit into boxes, so the
sizes of the images and the boxes are unrelated.

Only the first half of my new loop can be converted into a memcpy().
The second half with
dstBuf[nComps*x + c] = (srcBuf[nComps*p + c] + srcBuf[nComps*(p+1) + c]) / 2
has to stay a loop, although it might be possible to use the PAVGB instruction.
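As a rough illustration of the two halves (this is only a sketch, not the
attached patch; the helper names copyPixel and averagePixel are made up, and I
am assuming one unsigned char per component):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: destination pixel x is an exact copy of source pixel p,
// so all nComps bytes can be moved with a single memcpy.
static void copyPixel(unsigned char *dstBuf, const unsigned char *srcBuf,
                      int nComps, int x, int p) {
    assert(nComps > 0);
    std::memcpy(dstBuf + nComps * x, srcBuf + nComps * p, nComps);
}

// Hypothetical helper: destination pixel x is the midpoint of source pixels
// p and p+1, so each component needs arithmetic and the loop has to stay.
static void averagePixel(unsigned char *dstBuf, const unsigned char *srcBuf,
                         int nComps, int x, int p) {
    assert(nComps > 0);
    for (int c = 0; c < nComps; ++c) {
        // The operands promote to int, so the sum cannot overflow;
        // the division truncates toward zero.
        dstBuf[nComps * x + c] = static_cast<unsigned char>(
            (srcBuf[nComps * p + c] + srcBuf[nComps * (p + 1) + c]) / 2);
    }
}
```

One caveat about PAVGB: it computes (a + b + 1) >> 1, so it rounds up where the
loop above truncates, and the output could differ by one whenever a + b is odd.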

You were right that memcpy was faster than a for loop.  I am using gcc-4.7.2 on
64-bit Fedora 17, and I had thought that the optimizer would be able to turn
the loop into a memcpy on its own.  A for loop with only one statement in its
body shouldn't be that hard for a compiler to optimize.  Maybe gcc is afraid
that the source and destination might overlap.
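To show why overlap matters (a sketch only; I have not checked whether this is
what actually stopped gcc-4.7.2, and the function names are made up): a forward
byte loop has defined behavior when the buffers overlap, so the compiler must
preserve that behavior unless it can prove the buffers are distinct.
__restrict__ is a GCC/Clang extension that makes that promise explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Plain forward byte loop.  Its result is well defined even when dst and src
// overlap, so the compiler may not blindly replace it with memcpy.
static void copyLoop(unsigned char *dst, const unsigned char *src,
                     std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}

// Same loop, but __restrict__ (a GCC/Clang extension, not standard C++)
// promises the buffers do not overlap.  Calling this with overlapping
// buffers is undefined behavior.
static void copyLoopNoAlias(unsigned char *__restrict__ dst,
                            const unsigned char *__restrict__ src,
                            std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}
```

With the array {1, 2, 3, 4, 5}, copyLoop(a + 1, a, 4) smears the first byte
across the rest of the array, while memmove(a + 1, a, 4) shifts the bytes
intact, so the two are not interchangeable when the buffers overlap.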

I repeated the runs a few times with all files and executables on a RAM disk.
Here are the real and user times in seconds for

pdftops -eps -level1sep /tmp/1288986-xpdfshadebug.pdf /tmp/x.eps

                        real     user
original git source     10.573   10.444
my old for loop patch   10.461   10.344
my new memcpy patch     10.448   10.308
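
To put those timings in proportion, here is the arithmetic as a small sketch
(percentSaved is just a made-up helper name, and the only inputs are the
numbers above):

```cpp
#include <cassert>

// Hypothetical helper: percentage of the baseline time saved by a patch.
static double percentSaved(double baseline, double patched) {
    assert(baseline > 0.0);
    return 100.0 * (baseline - patched) / baseline;
}
```

percentSaved(10.573, 10.448) is roughly 1.2 for the real time of the memcpy
patch, and percentSaved(10.444, 10.308) is roughly 1.3 for its user time.  The
old for loop patch comes out at about 1 percent on both.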

I have no special need to optimize this file specifically.  It is only an
example that shows the cost of the bilinear scaling.  Most of my files don't
exercise this code or else run too quickly to give useful measurements.

These patches are only an experiment to see if I could recover some of the
performance lost to the bilinear scaling by adding special cases that
don't need real-number calculations on each pixel, but it looks like the
special cases are too rare to have a big effect.

William
