[poppler] RFC: upstream optional threading support in pdftoppm for simple testing

Adam Reichold adamreichold at myopera.com
Mon Apr 8 08:46:17 PDT 2013


Hello,

Am 08.04.2013 06:53, schrieb Thomas Freitag:
> Am 08.04.2013 00:42, schrieb Albert Astals Cid:
>> El Diumenge, 7 d'abril de 2013, a les 22:07:46, Thomas Freitag va
>> escriure:
>>> Am 07.04.2013 21:12, schrieb Albert Astals Cid:
>>>> El Diumenge, 7 d'abril de 2013, a les 16:31:52, Adam Reichold va
>>>> escriure:
>>>>> Hello,
>>>>>
>>>>> Am 07.04.2013 16:13, schrieb Albert Astals Cid:
>>>>>> El Dissabte, 6 d'abril de 2013, a les 17:43:54, Adam Reichold va
>> escriure:
>>>>>>> Hello,
>>>>>>>
>>>>>>> Am 06.04.2013 17:14, schrieb Albert Astals Cid:
>>>>>>>> El Divendres, 5 d'abril de 2013, a les 21:43:28, Adam Reichold va
>>>>>> escriure:
>>>>>>>>> Hello again,
>>>>>>>>>
>>>>>>>>> I was a bit in a rush at the first try. Sorry for that, I
>>>>>>>>> tidied it
>>>>>>>>> up
>>>>>>>>> slightly.
>>>>>>>> Maybe we should rename from UTILS_USE_THREAD to UTILS_USE_PTHREAD ?
>>>>>>>>
>>>>>>>> Or add a comment somewhere that we only support pthreads for now
>>>>>>>> somewhere?
>>>>>>> I would be fine with both.
>>>>>>>
>>>>>>> Actually, since this is mostly meant for testing, I would be fine
>>>>>>> with
>>>>>>> not making it accessible via autotools or CMake at all, i.e. just
>>>>>>> add
>>>>>>> the definition to 'config.h' manually when and if we need it.
>>>>>> Makes sense to me, code-wise what's the difference between this
>>>>>> and the
>>>>>> code Thomas posted in the threading bug? Do you think this is
>>>>>> simpler/easier to understand?
>>>>> Yes, the difference is that I left out the Windows-specific part and
>>>>> tried to keep it as simple as possible. For example, I think
>>>>> synchronizing on the job queue is simpler than synchronizing on the
>>>>> thread state. But of course, my implementation is not very
>>>>> efficient in
>>>>> terms of performance, just sufficient for testing.
>>>> Thomas would you be OK if we merge this patchset or you'd prefer yours
>>>> (more complex?) to be in?
>>> Oh, I never thought to get a question like this. To answer it, I need to
>>> to go a little bit more far afield: When I started to implement it,
>>> first of all I need a test case. Therefore I made that hack to pdftoppm
>>> to use multiple threads under Windows (still my favorite programming
>>> platform, I'm too old to change my habits), the pthread version I made
>>> much later to run it over the PDF suite and so that You can test it,
>>> too. But it was never made to publish it.
>>> I think that a multi
>>> threading version of pdftoppm is not really necessary for the community.
>> If we replace community by final user, agreed.
>>
>>> BUT: because we now support multi threading, we need a test case in
>>> general. I haven't review Adam's patch in detail, but if You think it is
>>> sufficient as test case (and I don't think, that anyone need a Windows
>>> test case), I can live with it. If You otherwise mean that we need a
>>> more general support also on Windows, I'll spend one or two weekends to
>>> revise my solution.
>> Honestly *I* don't need Windows support, but that's not *my* project
>> but a
>> community one and you seem to need it so we need it.
>>
>> What I didn't originally like about your patch is that it was a bit
>> too much
>> intrusive for my taste, and to be honest Adam's a bit too.
> The advantage of a "pdftoppm" test case is that we can use it in the
> regression test without any manipulation and can compare the multi
> threading results with the "normal" results. And with the actual state
> we still have different results
> 1. with damaged PDF where the "normal" pdftoppm produces output for
> pages which can not be rendered by copying the last producable page.
> 2. with the type 3 fonts where the caching produces "random" result.
> 3. with non embedded fonts where the decision which external font should
> be used is sometimes different.
> But to solve this even I don't need a windows solution. I did need it
> only in the past to set breakpoints to have an easier look at variables,
> memory and call stack.

Personally, I see the problem with touching the 'pdftoppm' code which is
why I tried to keep the code as simple and small as possible. (Also,
without the 'UTILS_USE_THREADS' definition, the compiler sees exactly
the same 'pdftoppm' as before.) But I would vote for a solution that
integrates with the regression test framework which is why I chose
'pdftoppm' in the first place.

Best regards, Adam.

>>
>> So that got me thinking into the fact that we don't really need much
>> of the
>> pdftoppm code, we just need to create a SplashOutputDev and feed it
>> into a few
>> threads.
>>
>> So what do you think of "separate" non pdftoppm code in the test/ folder?
>>
>> This way we can even have a thread_pthread_test.cpp and a
>> thread_windows_test.cpp, etc. without "polluting" the regular pdftoppm
>> code.
> What do You think about Adam's qt testcase? It has the advantage of not
> only test the rendering, even if it need some enhancements, because You
> still cannot modify the same annotation from different threads at the
> same time.
> 
> Cheers,
> Thomas
>>
>> Thoughts?
>>
>> Cheers,
>>    Albert
>>
>>> So I band the ball back to You.
>>> Cheers,
>>> Thomas
>>>
>>>> Cheers,
>>>>
>>>>     Albert
>>>>> Best regards, Adam.
>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>     Albert
>>>>>>> Best regards, Adam.
>>>>>>>
>>>>>>>> Besides that it looks ok-ish in a quick look.
>>>>>>>>
>>>>>>>> Anyone has a comment?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>>     Albert
>>>>>>>>> Best regards, Adam.
>>>>>>>>>
>>>>>>>>> Am 05.04.2013 19:27, schrieb Adam Reichold:
>>>>>>>>>> Hello everyone,
>>>>>>>>>>
>>>>>>>>>> To make it easier for us to test changes w.r.t. to threading, I
>>>>>>>>>> would
>>>>>>>>>> propose to commit a simple implementation of threading in
>>>>>>>>>> 'pdftoppm'
>>>>>>>>>> to
>>>>>>>>>> master.
>>>>>>>>>>
>>>>>>>>>> The attached patch contains a very simple implementation that
>>>>>>>>>> is not
>>>>>>>>>> focused on maximal performance but should suffice to stress the
>>>>>>>>>> locking
>>>>>>>>>> inside Poppler's core. I opted to implement only the POSIX
>>>>>>>>>> approach
>>>>>>>>>> since I suppose POSIX systems are where most of us test and
>>>>>>>>>> the code
>>>>>>>>>> is
>>>>>>>>>> hopefully simple and short enough not become a maintenance
>>>>>>>>>> burden.
>>>>>>>>>>
>>>>>>>>>> What do you think?
>>>>>>>>>>
>>>>>>>>>> Best regards, Adam.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> poppler mailing list
>>>>>>>>>> poppler at lists.freedesktop.org
>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>> _______________________________________________
>>>>>>>> poppler mailing list
>>>>>>>> poppler at lists.freedesktop.org
>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>> _______________________________________________
>>>>>>> poppler mailing list
>>>>>>> poppler at lists.freedesktop.org
>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>> _______________________________________________
>>>>>> poppler mailing list
>>>>>> poppler at lists.freedesktop.org
>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>> _______________________________________________
>>>>> poppler mailing list
>>>>> poppler at lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>> _______________________________________________
>>>> poppler mailing list
>>>> poppler at lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>
>>>> .
>>> _______________________________________________
>>> poppler mailing list
>>> poppler at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>> _______________________________________________
>> poppler mailing list
>> poppler at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>
>> .
>>
> 
> 
> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/poppler
> 


More information about the poppler mailing list