[poppler] Question on how to ensure API compatibility

Adam Reichold adamreichold at myopera.com
Wed Aug 8 00:00:27 PDT 2012


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

On 07.08.2012 22:50, Albert Astals Cid wrote:
> On Sunday, 5 August 2012, at 17:51:01, Adam Reichold wrote:
> On 05.08.2012 17:26, Albert Astals Cid wrote:
>>>> On Sunday, 5 August 2012, at 17:16:46, Adam Reichold wrote:
>>>> On 05.08.2012 16:36, Ihar `Philips` Filipau wrote:
>>>>>>> On 8/5/12, Albert Astals Cid <aacid at kde.org> wrote:
>>>>>>>>> Exactly, that's why I am unsure whether I can
>>>>>>>>> change
>>>>>>>>> 
>>>>>>>>> QImage renderToImage(double xres=72.0, double
>>>>>>>>> yres=72.0, int x=-1, int y=-1, int w=-1, int h=-1,
>>>>>>>>> Rotation rotate = Rotate0) const;
>>>>>>>>> 
>>>>>>>>> to
>>>>>>>>> 
>>>>>>>>> QImage renderToImage(double xres=72.0, double
>>>>>>>>> yres=72.0, int x=-1, int y=-1, int w=-1, int h=-1,
>>>>>>>>> Rotation rotate = Rotate0, bool multiThreading =
>>>>>>>>> false) const;
>>>>>>>>> 
>>>>>>>>> in "Poppler::Page" defined in
>>>>>>>>> "qt4/src/poppler-qt4.h" without breaking something.
>>>>>>>>> Recompiling is obviously fine, but would
>>>>>>>>> applications that were linked against Poppler
>>>>>>>>> before that change still work?
>>>>>>>> 
>>>>>>>> No, they wouldn't.
>>>>>>>> 
>>>>>>>> Here is a nice overview of the dos and don'ts.
>>>>>>>> http://techbase.kde.org/Policies/Binary_Compatibility_Issues_With_C++
>>>>>>>
>>>>>>> Extremely valuable link. Thanks.
>>>>>>>
>>>>>>> On topic. Quote from the link: -- If you need to
>>>>>>> extend/modify the parameter list of an existing
>>>>>>> function, you need to add a new function instead with
>>>>>>> the new parameters. In that case, you may want to add a
>>>>>>> short note that the two functions shall be merged with
>>>>>>> a default argument in later versions of the library:
>>>>>>> 
>>>>>>> void functionname( int a ); void functionname( int a,
>>>>>>> int b ); //BCI: merge with int b = 0 -- The open
>>>>>>> question is whether the "need" is there.
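For illustration, the overload approach quoted above could look roughly like the following. This is only a sketch under assumptions: "Page" and "RenderResult" are simplified stand-ins (the real Poppler::Page::renderToImage returns a QImage and takes more parameters), and in a real shared library the definitions would live in the .cpp file so that the old symbol stays exported.

```cpp
#include <cassert>

// Hypothetical stand-in for the QImage result; it only records the arguments.
struct RenderResult {
    double xres;
    double yres;
    bool multiThreading;
};

// Hypothetical, reduced stand-in for Poppler::Page.
class Page {
public:
    // Existing signature, left untouched so already-linked applications
    // still find the symbol they were built against.
    RenderResult renderToImage(double xres = 72.0, double yres = 72.0) const;

    // New overload without default arguments, so calls stay unambiguous.
    // BCI: merge with bool multiThreading = false
    RenderResult renderToImage(double xres, double yres, bool multiThreading) const;
};

// The old entry point simply forwards to the new one.
RenderResult Page::renderToImage(double xres, double yres) const {
    return renderToImage(xres, yres, false);
}

RenderResult Page::renderToImage(double xres, double yres, bool multiThreading) const {
    assert(xres > 0.0 && yres > 0.0);
    return RenderResult{xres, yres, multiThreading};
}
```

Existing callers keep resolving to the first overload, and only callers that pass the third argument explicitly reach the new one.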
>>>> 
>>>> For a simple document (some mathematical text), the average
>>>> time for the rendering alone goes from *(46 +- 10) ms*
>>>> per-page (using Poppler 0.20.2) to *(69 +- 20) ms* per-page
>>>> (using Poppler master with Thomas' patch) on my system.
>>>> (Using 100 samples in both cases.)
>>>> 
>>>>> Well, 20 msec is something i can live with, i'm more scared
>>>>> of "big documents" if it adds something like one second or
>>>>> half a second :D
> 
> Ok, I tested a very image-heavy document where the benefits of
> multi-threading are rather pronounced and the rendering time
> per-page went up from (945 +- 270) ms to (1131 +- 256) ms. But of
> course, the overall rendering time went down significantly as all
> three CPU cores were constantly maxed out using multi-threading
> whereas two were always idling without it.
> 
> So in this case, the overhead decreased from 33% to 16%. But the
> document is image-heavy, not complicated w.r.t. the structure.
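For reference, the two percentages are consistent with taking the extra per-page time as a fraction of the multi-threaded per-page time. That reading is an assumption; dividing by the single-threaded time instead would give 50% and roughly 20%. A small sketch of the arithmetic, with "relativeOverhead" being a made-up helper name:

```cpp
#include <cassert>

// Extra per-page time as a fraction of the multi-threaded per-page time.
// Assumption: this is the definition behind the quoted 33% and 16%.
double relativeOverhead(double baselineMs, double multiThreadedMs) {
    assert(multiThreadedMs > 0.0);
    return (multiThreadedMs - baselineMs) / multiThreadedMs;
}
```

With the averages from this thread, relativeOverhead(46.0, 69.0) is 23/69 and relativeOverhead(945.0, 1131.0) is 186/1131.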
> 
> Also, I noticed that Poppler::Page::renderToPainter creates an
> ArthurOutputDev for each call as well and always creating a
> SplashOutputDev
> 
>> renderToPainter creates a SplashOutputDev?

Obviously not since renderToPainter does not support using Splash, but
it creates an ArthurOutputDev for every call and it is used in
renderToImage, i.e. ArthurOutputDevs are never cached.

Since the underlying SplashBitmap also gets dropped at the end of
renderToImage, it seems reasonable not to cache the SplashOutputDev
either, which also simplifies things in Poppler::Document and
Poppler::DocumentData.
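
To make the idea concrete, here is a rough sketch of a per-call output device. It does not use the real Poppler classes: "FakeOutputDev", "FakeDocumentData" and "Bitmap" are hypothetical stand-ins, and the "rendering" only fills the bitmap with the paper color. The point is merely that the document data keeps plain settings and every render call builds and discards its own device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Bitmap = std::vector<std::uint32_t>;

// Stand-in for SplashOutputDev: owns its bitmap for one render call only.
class FakeOutputDev {
public:
    FakeOutputDev(std::uint32_t paperColor, int width, int height)
        : bitmap_(static_cast<std::size_t>(width) * static_cast<std::size_t>(height),
                  paperColor) {}

    // Hands the bitmap to the caller; the device is discarded afterwards.
    Bitmap takeBitmap() { return std::move(bitmap_); }

private:
    Bitmap bitmap_;
};

// Stand-in for the document data: only settings, no cached output device.
struct FakeDocumentData {
    std::uint32_t paperColor = 0xFFFFFFFFu;
};

// Each call creates a temporary device from the current settings, so changing
// the paper color needs no update or re-creation of a cached device.
Bitmap renderToImage(const FakeDocumentData& doc, int width, int height) {
    assert(width > 0 && height > 0);
    FakeOutputDev dev(doc.paperColor, width, height);
    return dev.takeBitmap();
}
```

In this shape a setter like setPaperColor reduces to storing a value, at the price of constructing the device on every call.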

> would have the added benefit of removing all output device handling
> from Poppler::Document and Poppler::DocumentData. (I.e.
> Poppler::DocumentData::getOutputDev could go the way of the Dodo 
> and things like Poppler::Document::setPaperColor and
> ::setRenderHints would simplify.)
> 
>>>> I changed Poppler::Page::renderToImage to just create a
>>>> temporary SplashOutputDev for every call and used the fMT
>>>> flag, so the overhead is probably mostly creating the output
>>>> device. Not sure whether caching the output device in
>>>> PageData is a good idea... (I think it would be ideal to have
>>>> as many output devices as rendering threads but IMHO that
>>>> would imply too close a coupling to the application using the
>>>> library.)
>>>> 
>>>>> Yeah well, creating a SplashOutputDev is not "cheap" since
>>>>> it creates the SplashBitmap and Splash.
>>>> 
>>>> Maybe you could tell what kind of data you'd like to see so
>>>> that I could try to provide it? Not sure how helpful such a
>>>> simple average is.
>>>> 
>>>>> Well, as far as i can see the only overhead is the copying
>>>>> of the xref, what about a program that calls the xref
>>>>> copying function so I can call it over all the thousand pdf
>>>>> i have lying around to see how much time it takes on them?
> 
> I am not sure I am versed enough in the internal Poppler API to 
> achieve this. Could you give me a hint where in the code base I
> could find a similar test?
> 
>> Just do a sample program that does foo = new PDFDoc()
>> foo->getXRef()->copy();
> 
>> And check how much time it takes

The attached program hopefully does that. Testing it on my limited
collection of PDF files and averaging over 10000 calls to XRef::copy
plus deletion of the copy, each call usually takes between 0.01 and
0.02 milliseconds and always less than 0.1 milliseconds on my machine.
(CPU is an AMD A6-3500.)
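
Since the archive scrubs attachments, here is a sketch of the measurement idea only. It is not the attached program: "FakeXRef" is a hypothetical stand-in for the real XRef (whose entries carry more than an offset), and "averageMillis" and "timeCopies" are made-up helper names. It simply averages the wall-clock time of a copy followed by its deletion.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Average wall-clock time of one call to op, in milliseconds.
template <typename Op>
double averageMillis(Op op, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        op();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// Hypothetical stand-in for an XRef table.
struct FakeXRef {
    std::vector<long> offsets;

    // Returns a heap-allocated deep copy that the caller must delete.
    FakeXRef* copy() const { return new FakeXRef(*this); }
};

// Times copy plus deletion of the copy, as described above.
double timeCopies(const FakeXRef& xref, int iterations) {
    return averageMillis([&xref] { delete xref.copy(); }, iterations);
}
```

With the real library, the lambda would call the document's XRef copy function instead of the stand-in.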

Best regards, Adam.

>> Cheers, Albert
> 
> 
>>>>> Albert
>>>>> 
>>>>>>> Wrt. multithreading. Just a thought. I had the impression
>>>>>>> that it should be already possible to create a private
>>>>>>> instance (per thread) of the document for the same PDF,
>>>>>>> so that the threads can rasterize the pages of the same
>>>>>>> PDF in parallel. Only trade off is the memory
>>>>>>> consumption.
>>>> 
>>>> We previously did this in qpdfview but I don't like the idea
>>>> of reparsing the document over and over to render each page.
>>>> It also adds problems for password-protected documents and
>>>> the semantics of what happens when the file changes on-disk
>>>> are different.
>>>> 
>>>> Regards, Adam.
>>>> 
>>>>>>> _______________________________________________
>>>>>>> poppler mailing list poppler at lists.freedesktop.org 
>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJQIg6LAAoJEPSSjE3STU34BwkIAIm12BQf04SFJQ0SZ4ALhw9c
O6nM+AR1stsXLdd0lwfxHNHTd8M4aDue5cBTxDAsLuVotO/F5pBYKaeTlmt5J5kX
XHlJdRXsJjrn740ucnNloUoAsmUVi+4CXtmXUkFh16oNjpx0/nUJpYlbXMKrXG4I
XJoIVH+cwfjBlplIWcj+xZ/CI0OOdYeYDUHqXAu3VeFwlccURc1FxqBsGRbnDOLz
fu+WqW6BpEjopU2148Ea8IyT6eLnpPvA3Qjjb++VuSUgLcA8Z85byKbqmzaD58PF
BnurdXyY5iD/zLWCW22+mDYETrvOdRwDFMXV8ZPAEUy6Vb2UIjcLcv+EdFjPo1Q=
=HNtC
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_xref_copy.cc
Type: text/x-c++src
Size: 630 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20120808/409dfe46/attachment.cc>

