[poppler] RFC: patch: parallel testing

Carlos Garcia Campos carlosgc at gnome.org
Sat Dec 29 10:48:24 PST 2012


Adam Reichold <adamreichold at myopera.com> writes:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello Carlos,
>
> I think the problem also happens with a tty but it is not immediately
> obvious: The output will look fine, but e.g. the "FAIL" appended to
> the current line might belong to a different test case. Consider the
> following chain of events (serialized by the Printer lock) using two
> worker threads:
>
> thread A: self.printer.print_test_start("Test A: ")
> thread B: self.printer.print_test_start("Test B: ")
> thread A: self.printer.print_test_result_ln("FAIL")
>
> which as I understand it should result in
>
> "Test B: FAIL"
>
> whereas it was test A that actually failed.

Ok, I've pushed a different version. The number of the test is not that
important in the end, so I think we could assign the numbers when
printing the results to ensure we keep the order.

Another issue I've noticed with the threads is that poppler-regtest
can't be kille with CTRL + C anymore. 

> Best regards, Adam.

Thanks,

> Am 29.12.2012 17:13, schrieb Carlos Garcia Campos:
>> Adam Reichold <adamreichold at myopera.com> writes:
>> 
>> Hello Thomas,
>> 
>> It is not a very sophisticated solution, but I think the attached 
>> patch should ensure useful messages for interleaved output of
>> several worker threads.
>> 
>>> I hadn't noticed this problem, and I can only reproduce it when
>>> running the tests in verbose mode, does it only happen when
>>> stdout is not a tty?
>> 
>> Best regards, Adam.
>> 
>> Am 29.12.2012 10:32, schrieb Thomas Freitag:
>>>>> Hi Adam!
>>>>> 
>>>>> I used yesterday the first time Your patch for regtesting:
>>>>> Awfully, congratulations. You probably already encountered my
>>>>> comment on bug 50992. One minor thing: Unfortunately the
>>>>> protocol is not really readable anymore, because each thread
>>>>> i.e. first outputs
>>>>> 
>>>>> Testing '/media/thomas/HD-PCTU3/PDF Suite/01_ump_a_2009.pdf'
>>>>> using splash backend (9/1137):
>>>>> 
>>>>> and when it finishes i.e.
>>>>> 
>>>>> PASS
>>>>> 
>>>>> Because of the worker threads it's now nearly impossible to
>>>>> match the test itself with the result of the test. Can You
>>>>> change it in that way that this is outputted together? Should
>>>>> be quite easy....
>>>>> 
>>>>> Cheers, Thomas
>>>>> 
>>>>> Am 06.12.2012 19:02, schrieb Adam Reichold: Hello again,
>>>>> 
>>>>> I reformatted the patch hopefully solving the whitespace
>>>>> issues. (It does apply and run for me, but there are still
>>>>> some whitespace warnings.)
>>>>> 
>>>>> The use of recursive locks is probably not necessary in 
>>>>> "TestRun.py" and "TestReferences.py", but since the
>>>>> performance impact is probably negligible in this case, I
>>>>> thought allowing for recursion would help to keep the code
>>>>> more accessible. In "Printer.py" there is at least one
>>>>> recursion from "printout_update" into "printout".
>>>>> 
>>>>> Best regards, Adam.
>>>>> 
>>>>> Am 06.12.2012 18:19, schrieb Adam Reichold:
>>>>>>>> Hello Carlos,
>>>>>>>> 
>>>>>>>> Am 06.12.2012 18:11, schrieb Carlos Garcia Campos:
>>>>>>>>> Adam Reichold <adamreichold at myopera.com> writes:
>>>>>>>>>> Hello again,
>>>>>>>>>> 
>>>>>>>>>> As suggested by William Bader, I added the option
>>>>>>>>>> to use the logical CPU count as the number of
>>>>>>>>>> worker threads automatically.
>>>>>>>>> Thanks for the patch, any improvement in the
>>>>>>>>> regression test framework is more than welcome, even
>>>>>>>>> more if it's to improve the performance. What poppler
>>>>>>>>> have you sued to create the patches? They don't apply
>>>>>>>>> for me here using poppler from current git master.
>>>>>>>> Yes, it is about performance and I am hesitant about 
>>>>>>>> introducing such complexity into the test driver which
>>>>>>>> is supposed to test the correctness of program itself,
>>>>>>>> but faster regression testing probably means more
>>>>>>>> regression testing... :-)
>>>>>>>> 
>>>>>>>> I am not sure how to correctly format the patch. I
>>>>>>>> built on the current master but it seems Git thinks
>>>>>>>> that the lines where I just changed the indentation are
>>>>>>>> whitespace errors. I am not sure how to correct this as
>>>>>>>> changes in whitespace are obviously important for
>>>>>>>> Python scripts... I'll try to research this, but any
>>>>>>>> advice would be welcome.
>>>>>>>> 
>>>>>>>> Best regards, Adam.
>>>>>>>> 
>>>>>>>>>> Regards, Adam.
>>>>>>>>>> 
>>>>>>>>>> Am 04.12.2012 12:28, schrieb Adam Reichold:
>>>>>>>>>>> Hello Thomas,
>>>>>>>>>>> 
>>>>>>>>>>> Am 04.12.2012 08:41, schrieb Thomas Freitag:
>>>>>>>>>>>> Am 04.12.2012 07:45, schrieb Adam Reichold:
>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>> 
>>>>>>>>>>>> I currently try to get myself acquainted with 
>>>>>>>>>>>> Poppler's regression testing framework. Because
>>>>>>>>>>>> my system has a rather low single-threaded
>>>>>>>>>>>> performance, I tried to implement parallel
>>>>>>>>>>>> testing using Python's Queue class.
>>>>>>>>>>>> 
>>>>>>>>>>>> Even though poppler-regtest currently uses two
>>>>>>>>>>>>  processes per test file, rendering even and
>>>>>>>>>>>> odd pages respectively, the test files
>>>>>>>>>>>> themselves are still handled sequentially and
>>>>>>>>>>>> both process are joined for each test file.
>>>>>>>>>>>> This will yield suboptimal system utilization
>>>>>>>>>>>> even for a small three-core system like mine.
>>>>>>>>>>>> 
>>>>>>>>>>>> Using the "-t/--threads N" option in the
>>>>>>>>>>>> patched poppler-regtest will spawn N worker
>>>>>>>>>>>> threads that handle all tests they can get from
>>>>>>>>>>>> a single queue for all known tests, allowing to
>>>>>>>>>>>> heavily utilize also large system if using a
>>>>>>>>>>>> large set of test cases. But even for my
>>>>>>>>>>>> three-core system, this brought down the time
>>>>>>>>>>>> to create references for the complete test 
>>>>>>>>>>>> suite using the Splash backend from 4,5 hours
>>>>>>>>>>>> to 2,75 hours.
>>>>>>>>>>>>> What do You mean with "patched
>>>>>>>>>>>>> poppler-regtest"? There was no attachment, or
>>>>>>>>>>>>> do I miss something?
>>>>>>>>>>> No, this on the process level in the test driver
>>>>>>>>>>> and independent of the your multi-threading work.
>>>>>>>>>>> There was a patch called "parallel_testing.patch"
>>>>>>>>>>> attached, as least my mail client tells me so.
>>>>>>>>>>> 
>>>>>>>>>>> I'll just try again...
>>>>>>>>>>> 
>>>>>>>>>>> Best regards, Adam.
>>>>>>>>>>> 
>>>>>>>>>>>>> If You talk about my multi-threaded testcase,
>>>>>>>>>>>>>  please be aware, that it is still
>>>>>>>>>>>>> experimental, even if quite close. Here the
>>>>>>>>>>>>> result of my last regression test last sunday
>>>>>>>>>>>>> evening: Total 1133 tests 1114 tests passed
>>>>>>>>>>>>> (98.32%) 17 tests failed (1.50%):
>>>>>>>>>>>>> /media/thomas/HD-PCTU3/PDF Suite/Algorithmics
>>>>>>>>>>>>> - The Spirit of Computing, 3rd Ed.pdf
>>>>>>>>>>>>> (splash), /media/thomas/HD-PCTU3/PDF 
>>>>>>>>>>>>> Suite/Essentials of English Grammar - 
>>>>>>>>>>>>> www.ielts4u.blogfa.com.pdf ... 2 tests
>>>>>>>>>>>>> crashed (0.18%): /media/thomas/HD-PCTU3/PDF 
>>>>>>>>>>>>> Suite/bug157090.pdf (splash), 
>>>>>>>>>>>>> /media/thomas/HD-PCTU3/PDF
>>>>>>>>>>>>> Suite/sinatr4c.f5.pdf (splash) I'll first
>>>>>>>>>>>>> have a look at the crashes next weekend, then
>>>>>>>>>>>>> I'll continue with looking at the failed
>>>>>>>>>>>>> tests.... Cheers, Thomas
>>>>>>>>>>>> 
>>>>>>>>>>>> IMHO, the necessary changes seem quite small 
>>>>>>>>>>>> especially since a lot of them are connected to
>>>>>>>>>>>>  indentation handling. What are your thoughts
>>>>>>>>>>>> on the utility and implementation of this?
>>>>>>>>>>>> 
>>>>>>>>>>>> Best regards, Adam.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>  poppler mailing list
>>>>>>>>>>>>> poppler at lists.freedesktop.org
>>>>>>>>>>>>> 
>>>>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>
>>>>>>>>>>>>> 
> _______________________________________________ poppler
>>>>>>>>>>>> mailing list poppler at lists.freedesktop.org 
>>>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>
>>>>>>>>>>>> 
> _______________________________________________ poppler
>>>>>>>>>>>> mailing list poppler at lists.freedesktop.org 
>>>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>
>>>>>>>>>>>> 
> _______________________________________________ poppler mailing
>>>>>>>> list poppler at lists.freedesktop.org 
>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________ poppler
>>>>>> mailing list poppler at lists.freedesktop.org 
>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>> 
>>> From 5e7461569e3085370326203539fa02de968eb2ca Mon Sep 17 00:00:00
>>> 2001 From: Adam Reichold <adamreichold at myopera.com> Date: Sat, 29
>>> Dec 2012 15:17:33 +0100 Subject: [PATCH] print messages that are
>>> useful with interleaved output by several worker threads
>>> 
>>> --- regtest/TestRun.py | 18 +++++++++--------- 1 file changed, 9
>>> insertions(+), 9 deletions(-)
>>> 
>>> diff --git a/regtest/TestRun.py b/regtest/TestRun.py index
>>> 5564757..750b93c 100644 --- a/regtest/TestRun.py +++
>>> b/regtest/TestRun.py @@ -71,7 +71,7 @@ class TestRun: with
>>> self._lock: self._n_tests += 1
>>> 
>>> -        self.printer.print_test_start("Testing '%s' using %s
>>> backend (%d/%d): " % (doc_path, backend.get_name(), n_doc,
>>> total_docs)) +        test_msg = "Testing '%s' using %s backend
>>> (%d/%d): " % (doc_path, backend.get_name(), n_doc, total_docs) 
>>> test_has_md5 = backend.create_refs(doc_path, test_path)
>>> 
>>> if backend.has_stderr(test_path): @@ -81,24 +81,24 @@ class
>>> TestRun: if ref_has_md5 and test_has_md5: if
>>> backend.compare_checksums(refs_path, test_path, not
>>> self.config.keep_results, self.config.create_diffs,
>>> self.config.update_refs): # FIXME: remove dir if it's empty? -
>>> self.printer.print_test_result("PASS") +
>>> self.printer.printout(test_msg + "PASS") with self._lock: 
>>> self._n_passed += 1 else: -
>>> self.printer.print_test_result_ln("FAIL") +
>>> self.printer.printout_ln(test_msg + "FAIL") with self._lock: 
>>> self._failed.append("%s (%s)" % (doc_path, backend.get_name())) 
>>> return elif test_has_md5: if ref_is_crashed: -
>>> self.printer.print_test_result_ln("DOES NOT CRASH") +
>>> self.printer.printout_ln(test_msg + "DOES NOT CRASH") elif
>>> ref_is_failed: -
>>> self.printer.print_test_result_ln("DOES NOT FAIL") +
>>> self.printer.printout_ln(test_msg + "DOES NOT FAIL") return
>>> 
>>> test_is_crashed = backend.is_crashed(test_path) if ref_is_crashed
>>> and test_is_crashed: -
>>> self.printer.print_test_result("PASS (Expected crash)") +
>>> self.printer.printout_ln(test_msg + "PASS (Expected crash)") with
>>> self._lock: self._n_passed += 1 return @@ -106,19 +106,19 @@
>>> class TestRun: test_is_failed = backend.is_failed(test_path) if
>>> ref_is_failed and test_is_failed: # FIXME: compare status errors 
>>> -            self.printer.print_test_result("PASS (Expected fail
>>> with status error %d)" % (test_is_failed)) +
>>> self.printer.printout(test_msg + "PASS (Expected fail with status
>>> error %d)" % (test_is_failed)) with self._lock: self._n_passed +=
>>> 1 return
>>> 
>>> if test_is_crashed: -
>>> self.printer.print_test_result_ln("CRASH") +
>>> self.printer.printout_ln(test_msg + "CRASH") with self._lock: 
>>> self._crashed.append("%s (%s)" % (doc_path, backend.get_name())) 
>>> return
>>> 
>>> if test_is_failed: -
>>> self.printer.print_test_result_ln("FAIL (status error %d)" %
>>> (test_is_failed)) +            self.printer.printout_ln(test_msg
>>> + "FAIL (status error %d)" % (test_is_failed)) with self._lock: 
>>> self._failed_status_error("%s (%s)" % (doc_path,
>>> backend.get_name())) return -- 1.8.0.3
>>> 
>>> _______________________________________________ poppler mailing
>>> list poppler at lists.freedesktop.org 
>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.19 (GNU/Linux)
>
> iQEcBAEBAgAGBQJQ3xsCAAoJEPSSjE3STU34CZAIAKvyx1zQDNH5KNx4OfEUSWDy
> 8K5SV+J3UcK8HODqbQigzRJ4W7aVc6Hn/TPDXVVbktr7Et4NYmzZaPXIsJ0d2GUR
> k7S+XIZ1JbDP/IDzG9iPedMvFL7ZFnTForOS0+DQpDakEDrRMsd6/xhcT2dNDso7
> ssqMG4OW+wGAe4lv5CMNyS1I3dQrggTkOjMS4+yiTPmnkqb+WLA3Ufn9hCjl3AVz
> Euyni1z7l3oirajYoU6nf/nk9q/vCcFkHcKKkIFMyjE1VDIStSVLze9NqnyjT5ms
> Yy1nAi9mfDWj3/VrdLxmVipeYZHV9+yN4CuAecZPWRnBpy5n9RYDAzaQIE1Io2k=
> =lXUz
> -----END PGP SIGNATURE-----
> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/poppler

-- 
Carlos Garcia Campos
PGP key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x523E6462
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20121229/6a6b66a3/attachment-0001.pgp>


More information about the poppler mailing list