[poppler] patch for a progress callback

Adrian Johnson ajohnson at redneon.com
Tue Jan 8 04:30:14 PST 2013


On 07/01/13 21:09, lists at ds.com wrote:
> void (*progressCbk)(int pageNum, float progressPct, void* user_data)
> = NULL, void *progressCbkData = NULL );

I would use a double for the progress.
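
As a rough sketch of what I mean (the typedef name and the helper below are made up for illustration and are not part of the patch or of poppler), the callback would look like this with a double:

```cpp
// Hypothetical callback type: same shape as the proposed signature,
// but with the progress value as a double instead of a float.
typedef void (*ProgressCbk)(int pageNum, double progressPct, void *user_data);

// Hypothetical helper: forwards a percentage to the callback, if one is set.
// 'done' and 'total' are whatever units the caller uses to measure progress.
static void reportProgress(ProgressCbk cbk, void *cbkData,
                           int pageNum, double done, double total)
{
    if (cbk == nullptr || total <= 0.0) {
        return; // nothing to report to, or no meaningful total
    }
    cbk(pageNum, 100.0 * done / total, cbkData);
}
```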

>> * The DummyOutputDev

NullOutputDev would be a better name, but I would prefer that the
pre-rendering step be avoided.

> On 06/01/13 18:35, Adrian Johnson <ajohnson at redneon.com> wrote:
>> Have you got a sample PDF we can test with? It would be interesting
>> to see if there is anything that can be done to speed up the
>> rendering.
> 
> https://dl.dropbox.com/u/16106653/pdfToImage-progress-example.zip
> 
> Note: the PDFs in this zip will probably take a very, very long time
> to render with standard desktop PDF viewers. The first one
> 2012_12_28_13_6_22.pdf (generated by Cairo) is reasonably well
> behaved. The second one 2012_12_28_13_6_22_0,5stroke-1.pdf (generated
> by loading 2012_12_28_13_6_22.pdf into Adobe Illustrator and
> modifying the stroke width for all of the lines) seems to have
> multiple embedded streams, but is typical of the documents I was
> working with in making this patch.
> 
> After compiling pdfToImage-progress.cpp, try: ./a.out -i
> 2012_12_28_13_6_22.pdf -o test.png --width 12000 (12000 pixels is the
> image size necessary to produce a ~100cm wide print at 300DPI) (This
> is a direct output from an openFrameworks application that draws
> several thousand line segments every frame, running at 60fps for a
> minute or two)
> 
> For a poor performance example, try ./a.out -i
> 2012_12_28_13_6_22_0,5stroke-1.pdf -o test.png --width 3508 (3508
> pixels wide is a DIN-A4 print at 300DPI) (This is generated from the
> same data file as above, after loading it into Adobe Illustrator,
> selecting all the lines and setting their stroke width to 0.5 -- yes,
> there are better ways to achieve this than via a GUI application, but
> this is the use case I am working from.)

I timed both PDFs with the splash and cairo backends, using a
width/height of 12000/9000 for both. The results (in minutes:seconds)
are interesting:

                                       splash       cairo
2012_12_28_13_6_22.pdf                  4:47         5:29
2012_12_28_13_6_22_0,5stroke-1.pdf    276:40        11:01

The second PDF wraps each operation in a separate transparency group,
which splash handles poorly. A more efficient way to change the line
width from 1 to 0.5 would be to uncompress the file with pdftk, replace
each "1 w" with "0.5 w", then run it through pdftk again to fix up the
xref. The process could be scripted.
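
The replacement step could be sketched roughly as below. This is only an illustration: the function name is made up, and it assumes the file has already been uncompressed (for example with "pdftk in.pdf output tmp.pdf uncompress") and that each line-width operator sits on its own line, which may not hold for every producer.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: rewrite every line that is exactly "1 w"
// (set line width to 1) as "0.5 w". Assumes uncompressed content
// with one operator per line; other lines pass through unchanged.
std::string halveLineWidth(const std::string &content)
{
    std::istringstream in(content);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        if (line == "1 w") {
            out << "0.5 w";
        } else {
            out << line;
        }
        out << '\n';
    }
    return out.str();
}
```

The rewritten file would then go back through pdftk, as described above, so that the xref gets fixed up.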

>> Instead of parsing the content twice to get the operation count
>> you could use the current position in the content stream to report
>> progress. You would have to use the compressed stream position 
>> (getBaseStream()->getLength()) since the uncompressed length of a
>> stream is not stored in the pdf file.
> 
> I initially attempted that. With the
> 2012_12_28_13_6_22_0,5stroke-1.pdf from the zip linked above, it
> seems there are multiple streams embedded in the PDF, as the stream
> position pointer keeps on jumping back to zero. I'm assuming there
> are multiple streams, which means you'd need to do a preprocessing
> step which involves looping through the PDF, adding up all the stream
> lengths, and then actually rendering .... which is exactly what I've
> done with this patch, but counting operations rather than stream
> lengths.

See the attached patch, which implements the stream position tracking.
There is a big difference between the two approaches. Counting the
operations requires uncompressing and parsing the entire stream, whereas
the attached patch just adds up the length of each stream. As the stream
length is already available in the stream objects passed to Gfx, no
extra processing is needed to extract the lengths.
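
To illustrate the idea only (this is not the code in the attached patch, and the class and method names are hypothetical rather than poppler API), progress across several streams can be tracked by summing the compressed lengths up front and adding the position within the current stream to the lengths of the streams already finished:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical tracker: progress over several content streams, measured
// in compressed bytes. The position resetting to zero at the start of
// each new stream is handled by remembering the bytes already finished.
class ProgressTracker {
public:
    explicit ProgressTracker(const std::vector<std::size_t> &streamLengths)
        : total_(std::accumulate(streamLengths.begin(), streamLengths.end(),
                                 std::size_t{0})) {}

    // Call once a stream of the given compressed length is fully consumed.
    void finishStream(std::size_t length) { finished_ += length; }

    // Fraction complete in [0, 1], given the position in the current stream.
    double fraction(std::size_t posInCurrent) const
    {
        if (total_ == 0) {
            return 1.0; // nothing to render
        }
        return static_cast<double>(finished_ + posInCurrent) /
               static_cast<double>(total_);
    }

private:
    std::size_t total_;
    std::size_t finished_ = 0;
};
```

Building the total only needs the lengths, so nothing has to be uncompressed or parsed before rendering starts.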
-------------- next part --------------
A non-text attachment was scrubbed...
Name: use-file-position.diff
Type: text/x-patch
Size: 6148 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20130108/e2a88686/attachment.bin>

