[poppler] Multi-threading rendering on Raspberry Pi
pqt at LEFerguson.com
Wed Feb 22 19:20:53 UTC 2017
From: Thomas Freitag
Sent: Wednesday, February 22, 2017 10:44 AM
To: poppler at lists.freedesktop.org
Subject: Re: [poppler] Multi-threading rendering on Raspberry Pi
>> So I guess that in your case a lot of time is needed in parsing the
>> PDF objects and not so much in rendering them.
> That's actually a bit of my bafflement as well. Mostly these are PDF documents that contain nothing at all except one image per page: no text, no annotations. I take X TIFFs in Photoshop and produce an X-page PDF. So really it should be spending very little time parsing.
But time reading: the file handle of the PDF document is also shared between the threads, of course, so only one thread can read at a time.
And every read costs a new positioning in this case. But still, this is just a guess.
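To make that guess concrete, here is a minimal sketch of the pattern being described: one handle shared by several threads, where every read has to take a lock and reposition first, next to the alternative of each reader opening its own handle. This is not poppler's code. `SharedReader`, `read_with_own_handle` and `demo` are hypothetical names, and the scratch file only stands in for a PDF.

```python
import os
import tempfile
import threading


class SharedReader:
    """One file handle shared by all threads (hypothetical illustration)."""

    def __init__(self, path):
        self._fh = open(path, "rb")
        self._lock = threading.Lock()

    def read_at(self, offset, size):
        # Only one thread can be in here at a time, and each call must
        # seek again because another thread may have moved the handle.
        with self._lock:
            self._fh.seek(offset)
            return self._fh.read(size)

    def close(self):
        self._fh.close()


def read_with_own_handle(path, offset, size):
    """Each caller opens a private handle, so no lock is needed."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        return fh.read(size)


def demo():
    # Scratch file standing in for a document: two 4-byte "pages".
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"AAAABBBB")
        path = tmp.name
    try:
        shared = SharedReader(path)
        results = {}

        def worker(name, offset):
            results[name] = shared.read_at(offset, 4)

        threads = [
            threading.Thread(target=worker, args=("page1", 0)),
            threading.Thread(target=worker, args=("page2", 4)),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        shared.close()

        own = read_with_own_handle(path, 4, 4)
        return results, own
    finally:
        os.unlink(path)
```

Both approaches return the same bytes. The difference is only in whether readers have to wait on each other and re-seek on every call.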
I thought I would share a couple of numbers in case anyone has interest. Bear in mind the Pi 3B is a very different environment from most people's desktop even if it will run the same software.
I switched my code so that it opened the PDF document in a whole separate instance in each thread, so the open, any initialization, etc. is done twice (in this test only 2 threads) but presumably there is no cross thread locking since they are separate instances.
It actually is faster that way, but not much.
One thread at a time (one open): 5.7 seconds then 5.6 seconds, total 11.3 seconds for two.
Two threads, one open: 8.6 seconds for the first page, then 50 ms more for the second, about 8.6 seconds total for two.
Two threads, two opens: 7.2 seconds for first, 200ms more for second, 7.4 seconds for two.
I ran all of these multiple times, and the times are quite consistent. While there may be other issues (e.g. calculations done later on first use that are duplicated), the open itself is very fast: the millisecond-resolution timer shows zero.
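For anyone who wants to repeat the comparison, the two threaded setups measured above can be sketched roughly as follows. This is not my actual code and it does not call poppler. `open_document` and `render_page` are hypothetical callables that stand in for whatever opens the PDF and renders one page in a real application.

```python
import threading
import time


def render_pages(open_document, render_page, page_numbers, share_document):
    """Render each page in its own thread.

    Returns (results, elapsed_seconds). open_document and render_page are
    hypothetical callables supplied by the caller.
    """
    start = time.perf_counter()
    # "One open": a single instance is created here and used by every thread.
    shared = open_document() if share_document else None
    results = {}

    def worker(page_number):
        # "Two opens": each thread creates its own private instance instead.
        doc = shared if share_document else open_document()
        results[page_number] = render_page(doc, page_number)

    threads = [threading.Thread(target=worker, args=(n,)) for n in page_numbers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


def demo():
    opens = []  # one entry appended per open, so len(opens) counts them

    def open_document():
        opens.append(1)
        return object()  # placeholder for a real document instance

    def render_page(doc, page_number):
        return "page %d" % page_number  # placeholder for a rendered image

    shared_results, _ = render_pages(open_document, render_page, [1, 2], True)
    shared_opens = len(opens)
    separate_results, _ = render_pages(open_document, render_page, [1, 2], False)
    separate_opens = len(opens) - shared_opens
    return shared_results, shared_opens, separate_results, separate_opens
```

With real open and render functions plugged in, the elapsed time returned for `share_document=True` and `share_document=False` corresponds to the "one open" and "two opens" rows.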
I'm not sure the difference is worth the change for me -- it's about a second faster (I'm testing with high res monitor and image, so in real use it is much less than a second). But it is interesting.
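For reference, the arithmetic behind those comparisons, using only the three totals reported above. The function names are made up for this sketch.

```python
# Totals for two pages, in seconds, as reported above.
TOTALS = {
    "one thread, one open": 11.3,
    "two threads, one open": 8.6,
    "two threads, two opens": 7.4,
}


def speedup_table(totals, baseline):
    """Return {setup: speedup relative to baseline}, rounded to 2 places."""
    base = totals[baseline]
    return {name: round(base / seconds, 2) for name, seconds in totals.items()}


def seconds_saved(totals, slower, faster):
    """Return how many seconds the faster setup saves, rounded to 1 place."""
    return round(totals[slower] - totals[faster], 1)


if __name__ == "__main__":
    for name, factor in speedup_table(TOTALS, "one thread, one open").items():
        print(f"{name}: {factor}x")
    print(seconds_saved(TOTALS, "two threads, one open", "two threads, two opens"))
```

So two threads give roughly 1.3x with a shared instance and roughly 1.5x with separate instances, both short of the 2x that two fully independent threads would give, and the separate-instance version saves about 1.2 seconds on this test.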
I also did some experimentation with different image resolution on the (only) image on each page. Lower resolution makes a much bigger difference. So I think the real issue here may be in image manipulation. I may dig in a bit and see what it's doing under the covers -- not even sure how Photoshop is storing the embedded image when it converts.
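One likely reason resolution matters so much is that the pixel count grows with the square of the resolution. A quick sketch of that arithmetic, assuming a letter-size page (8.5 x 11 inches), 300 and 150 dpi, and 3 bytes per decoded pixel. All of those figures are illustrative assumptions, not measurements from my documents.

```python
def pixel_count(width_in, height_in, dpi):
    """Number of pixels in an image covering the page at the given dpi."""
    return round(width_in * dpi) * round(height_in * dpi)


def decoded_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Size of the uncompressed image, assuming 3 bytes (RGB) per pixel."""
    return pixel_count(width_in, height_in, dpi) * bytes_per_pixel


if __name__ == "__main__":
    full = pixel_count(8.5, 11, 300)   # assumed page size and resolution
    half = pixel_count(8.5, 11, 150)
    print(full, half, full / half)
    print(decoded_bytes(8.5, 11, 300))
```

Halving the resolution cuts the number of pixels to a quarter, so any per-pixel work (decoding, scaling, colour conversion) would be expected to shrink by about the same factor.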
But... back to threading... if I take for granted that separate document instances are independent, it looks like sharing one instance between multiple threads costs a small slowdown, but not much (at least in this environment with these documents, so it is not a number that generalizes).
Thanks again for listening.