[poppler] RFC: patch: parallel testing
Adam Reichold
adamreichold at myopera.com
Sat Dec 29 11:13:02 PST 2012
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello Carlos,
Am 29.12.2012 19:48, schrieb Carlos Garcia Campos:
> Adam Reichold <adamreichold at myopera.com> writes:
>
> Hello Carlos,
>
> I think the problem also happens with a tty but it is not
> immediately obvious: The output will look fine, but e.g. the "FAIL"
> appended to the current line might belong to a different test case.
> Consider the following chain of events (serialized by the Printer
> lock) using two worker threads:
>
> thread A: self.printer.print_test_start("Test A: ") thread B:
> self.printer.print_test_start("Test B: ") thread A:
> self.printer.print_test_result_ln("FAIL")
>
> which as I understand it should result in
>
> "Test B: FAIL"
>
> whereas it was test A that actually failed.
>
>> Ok, I've pushed a different version. The number of the test is
>> not that important in the end, so I think we could assign the
>> numbers when printing the results to ensure we keep the order.
>
>> Another issue I've noticed with the threads is that
>> poppler-regtest can't be kille with CTRL + C anymore.
Yes, the reason is that Queue.join() is a blocking call so that the
main thread can't handle the KeyboardInterrupt. There are several ways
to work around this, but I simply opted to print the PID before
spawning the worker threads to use with "kill". (Depending on the
shell used, one can suspend the task and use "kill %%".) More
complicated variants could poll the queue for unprocessed jobs and
handle the KeyboardInterrupt by clearing it.
Best regards, Adam.
> Best regards, Adam.
>
>> Thanks,
>
> Am 29.12.2012 17:13, schrieb Carlos Garcia Campos:
>>>> Adam Reichold <adamreichold at myopera.com> writes:
>>>>
>>>> Hello Thomas,
>>>>
>>>> It is not a very sophisticated solution, but I think the
>>>> attached patch should ensure useful messages for interleaved
>>>> output of several worker threads.
>>>>
>>>>> I hadn't noticed this problem, and I can only reproduce it
>>>>> when running the tests in verbose mode, does it only happen
>>>>> when stdout is not a tty?
>>>>
>>>> Best regards, Adam.
>>>>
>>>> Am 29.12.2012 10:32, schrieb Thomas Freitag:
>>>>>>> Hi Adam!
>>>>>>>
>>>>>>> I used yesterday the first time Your patch for
>>>>>>> regtesting: Awfully, congratulations. You probably
>>>>>>> already encountered my comment on bug 50992. One minor
>>>>>>> thing: Unfortunately the protocol is not really
>>>>>>> readable anymore, because each thread i.e. first
>>>>>>> outputs
>>>>>>>
>>>>>>> Testing '/media/thomas/HD-PCTU3/PDF
>>>>>>> Suite/01_ump_a_2009.pdf' using splash backend
>>>>>>> (9/1137):
>>>>>>>
>>>>>>> and when it finishes i.e.
>>>>>>>
>>>>>>> PASS
>>>>>>>
>>>>>>> Because of the worker threads it's now nearly
>>>>>>> impossible to match the test itself with the result of
>>>>>>> the test. Can You change it in that way that this is
>>>>>>> outputted together? Should be quite easy....
>>>>>>>
>>>>>>> Cheers, Thomas
>>>>>>>
>>>>>>> Am 06.12.2012 19:02, schrieb Adam Reichold: Hello
>>>>>>> again,
>>>>>>>
>>>>>>> I reformatted the patch hopefully solving the
>>>>>>> whitespace issues. (It does apply and run for me, but
>>>>>>> there are still some whitespace warnings.)
>>>>>>>
>>>>>>> The use of recursive locks is probably not necessary in
>>>>>>> "TestRun.py" and "TestReferences.py", but since the
>>>>>>> performance impact is probably negligible in this case,
>>>>>>> I thought allowing for recursion would help to keep the
>>>>>>> code more accessible. In "Printer.py" there is at least
>>>>>>> one recursion from "printout_update" into "printout".
>>>>>>>
>>>>>>> Best regards, Adam.
>>>>>>>
>>>>>>> Am 06.12.2012 18:19, schrieb Adam Reichold:
>>>>>>>>>> Hello Carlos,
>>>>>>>>>>
>>>>>>>>>> Am 06.12.2012 18:11, schrieb Carlos Garcia
>>>>>>>>>> Campos:
>>>>>>>>>>> Adam Reichold <adamreichold at myopera.com>
>>>>>>>>>>> writes:
>>>>>>>>>>>> Hello again,
>>>>>>>>>>>>
>>>>>>>>>>>> As suggested by William Bader, I added the
>>>>>>>>>>>> option to use the logical CPU count as the
>>>>>>>>>>>> number of worker threads automatically.
>>>>>>>>>>> Thanks for the patch, any improvement in the
>>>>>>>>>>> regression test framework is more than welcome,
>>>>>>>>>>> even more if it's to improve the performance.
>>>>>>>>>>> What poppler have you sued to create the
>>>>>>>>>>> patches? They don't apply for me here using
>>>>>>>>>>> poppler from current git master.
>>>>>>>>>> Yes, it is about performance and I am hesitant
>>>>>>>>>> about introducing such complexity into the test
>>>>>>>>>> driver which is supposed to test the correctness
>>>>>>>>>> of program itself, but faster regression testing
>>>>>>>>>> probably means more regression testing... :-)
>>>>>>>>>>
>>>>>>>>>> I am not sure how to correctly format the patch.
>>>>>>>>>> I built on the current master but it seems Git
>>>>>>>>>> thinks that the lines where I just changed the
>>>>>>>>>> indentation are whitespace errors. I am not sure
>>>>>>>>>> how to correct this as changes in whitespace are
>>>>>>>>>> obviously important for Python scripts... I'll
>>>>>>>>>> try to research this, but any advice would be
>>>>>>>>>> welcome.
>>>>>>>>>>
>>>>>>>>>> Best regards, Adam.
>>>>>>>>>>
>>>>>>>>>>>> Regards, Adam.
>>>>>>>>>>>>
>>>>>>>>>>>> Am 04.12.2012 12:28, schrieb Adam Reichold:
>>>>>>>>>>>>> Hello Thomas,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Am 04.12.2012 08:41, schrieb Thomas
>>>>>>>>>>>>> Freitag:
>>>>>>>>>>>>>> Am 04.12.2012 07:45, schrieb Adam
>>>>>>>>>>>>>> Reichold: Hello everyone,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I currently try to get myself acquainted
>>>>>>>>>>>>>> with Poppler's regression testing
>>>>>>>>>>>>>> framework. Because my system has a rather
>>>>>>>>>>>>>> low single-threaded performance, I tried
>>>>>>>>>>>>>> to implement parallel testing using
>>>>>>>>>>>>>> Python's Queue class.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Even though poppler-regtest currently
>>>>>>>>>>>>>> uses two processes per test file,
>>>>>>>>>>>>>> rendering even and odd pages
>>>>>>>>>>>>>> respectively, the test files themselves
>>>>>>>>>>>>>> are still handled sequentially and both
>>>>>>>>>>>>>> process are joined for each test file.
>>>>>>>>>>>>>> This will yield suboptimal system
>>>>>>>>>>>>>> utilization even for a small three-core
>>>>>>>>>>>>>> system like mine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Using the "-t/--threads N" option in the
>>>>>>>>>>>>>> patched poppler-regtest will spawn N
>>>>>>>>>>>>>> worker threads that handle all tests they
>>>>>>>>>>>>>> can get from a single queue for all known
>>>>>>>>>>>>>> tests, allowing to heavily utilize also
>>>>>>>>>>>>>> large system if using a large set of test
>>>>>>>>>>>>>> cases. But even for my three-core system,
>>>>>>>>>>>>>> this brought down the time to create
>>>>>>>>>>>>>> references for the complete test suite
>>>>>>>>>>>>>> using the Splash backend from 4,5 hours
>>>>>>>>>>>>>> to 2,75 hours.
>>>>>>>>>>>>>>> What do You mean with "patched
>>>>>>>>>>>>>>> poppler-regtest"? There was no
>>>>>>>>>>>>>>> attachment, or do I miss something?
>>>>>>>>>>>>> No, this on the process level in the test
>>>>>>>>>>>>> driver and independent of the your
>>>>>>>>>>>>> multi-threading work. There was a patch
>>>>>>>>>>>>> called "parallel_testing.patch" attached,
>>>>>>>>>>>>> as least my mail client tells me so.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll just try again...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards, Adam.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If You talk about my multi-threaded
>>>>>>>>>>>>>>> testcase, please be aware, that it is
>>>>>>>>>>>>>>> still experimental, even if quite
>>>>>>>>>>>>>>> close. Here the result of my last
>>>>>>>>>>>>>>> regression test last sunday evening:
>>>>>>>>>>>>>>> Total 1133 tests 1114 tests passed
>>>>>>>>>>>>>>> (98.32%) 17 tests failed (1.50%):
>>>>>>>>>>>>>>> /media/thomas/HD-PCTU3/PDF
>>>>>>>>>>>>>>> Suite/Algorithmics - The Spirit of
>>>>>>>>>>>>>>> Computing, 3rd Ed.pdf (splash),
>>>>>>>>>>>>>>> /media/thomas/HD-PCTU3/PDF
>>>>>>>>>>>>>>> Suite/Essentials of English Grammar -
>>>>>>>>>>>>>>> www.ielts4u.blogfa.com.pdf ... 2 tests
>>>>>>>>>>>>>>> crashed (0.18%):
>>>>>>>>>>>>>>> /media/thomas/HD-PCTU3/PDF
>>>>>>>>>>>>>>> Suite/bug157090.pdf (splash),
>>>>>>>>>>>>>>> /media/thomas/HD-PCTU3/PDF
>>>>>>>>>>>>>>> Suite/sinatr4c.f5.pdf (splash) I'll
>>>>>>>>>>>>>>> first have a look at the crashes next
>>>>>>>>>>>>>>> weekend, then I'll continue with
>>>>>>>>>>>>>>> looking at the failed tests.... Cheers,
>>>>>>>>>>>>>>> Thomas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> IMHO, the necessary changes seem quite
>>>>>>>>>>>>>> small especially since a lot of them are
>>>>>>>>>>>>>> connected to indentation handling. What
>>>>>>>>>>>>>> are your thoughts on the utility and
>>>>>>>>>>>>>> implementation of this?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards, Adam.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
poppler mailing list
>>>>>>>>>>>>>>> poppler at lists.freedesktop.org
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>>
>
>>>>>>>>>>>>>>>
_______________________________________________ poppler
>>>>>>>>>>>>>> mailing list
>>>>>>>>>>>>>> poppler at lists.freedesktop.org
>>>>>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>
>
>>>>>>>>>>>>>>
_______________________________________________ poppler
>>>>>>>>>>>>>> mailing list
>>>>>>>>>>>>>> poppler at lists.freedesktop.org
>>>>>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>
>>>>>>>>>>>>>>
>
>>>>>>>>>>>>>>
_______________________________________________ poppler mailing
>>>>>>>>>> list poppler at lists.freedesktop.org
>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>>
_______________________________________________ poppler
>>>>>>>> mailing list poppler at lists.freedesktop.org
>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>
>>>>>
>>>>>>>>
- From 5e7461569e3085370326203539fa02de968eb2ca Mon Sep 17 00:00:00
>>>>> 2001 From: Adam Reichold <adamreichold at myopera.com> Date:
>>>>> Sat, 29 Dec 2012 15:17:33 +0100 Subject: [PATCH] print
>>>>> messages that are useful with interleaved output by several
>>>>> worker threads
>>>>>
>>>>> --- regtest/TestRun.py | 18 +++++++++--------- 1 file
>>>>> changed, 9 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/regtest/TestRun.py b/regtest/TestRun.py index
>>>>> 5564757..750b93c 100644 --- a/regtest/TestRun.py +++
>>>>> b/regtest/TestRun.py @@ -71,7 +71,7 @@ class TestRun: with
>>>>> self._lock: self._n_tests += 1
>>>>>
>>>>> - self.printer.print_test_start("Testing '%s' using
>>>>> %s backend (%d/%d): " % (doc_path, backend.get_name(),
>>>>> n_doc, total_docs)) + test_msg = "Testing '%s' using
>>>>> %s backend (%d/%d): " % (doc_path, backend.get_name(),
>>>>> n_doc, total_docs) test_has_md5 =
>>>>> backend.create_refs(doc_path, test_path)
>>>>>
>>>>> if backend.has_stderr(test_path): @@ -81,24 +81,24 @@
>>>>> class TestRun: if ref_has_md5 and test_has_md5: if
>>>>> backend.compare_checksums(refs_path, test_path, not
>>>>> self.config.keep_results, self.config.create_diffs,
>>>>> self.config.update_refs): # FIXME: remove dir if it's
>>>>> empty? - self.printer.print_test_result("PASS") +
>>>>> self.printer.printout(test_msg + "PASS") with self._lock:
>>>>> self._n_passed += 1 else: -
>>>>> self.printer.print_test_result_ln("FAIL") +
>>>>> self.printer.printout_ln(test_msg + "FAIL") with
>>>>> self._lock: self._failed.append("%s (%s)" % (doc_path,
>>>>> backend.get_name())) return elif test_has_md5: if
>>>>> ref_is_crashed: - self.printer.print_test_result_ln("DOES
>>>>> NOT CRASH") + self.printer.printout_ln(test_msg + "DOES NOT
>>>>> CRASH") elif ref_is_failed: -
>>>>> self.printer.print_test_result_ln("DOES NOT FAIL") +
>>>>> self.printer.printout_ln(test_msg + "DOES NOT FAIL")
>>>>> return
>>>>>
>>>>> test_is_crashed = backend.is_crashed(test_path) if
>>>>> ref_is_crashed and test_is_crashed: -
>>>>> self.printer.print_test_result("PASS (Expected crash)") +
>>>>> self.printer.printout_ln(test_msg + "PASS (Expected
>>>>> crash)") with self._lock: self._n_passed += 1 return @@
>>>>> -106,19 +106,19 @@ class TestRun: test_is_failed =
>>>>> backend.is_failed(test_path) if ref_is_failed and
>>>>> test_is_failed: # FIXME: compare status errors -
>>>>> self.printer.print_test_result("PASS (Expected fail with
>>>>> status error %d)" % (test_is_failed)) +
>>>>> self.printer.printout(test_msg + "PASS (Expected fail with
>>>>> status error %d)" % (test_is_failed)) with self._lock:
>>>>> self._n_passed += 1 return
>>>>>
>>>>> if test_is_crashed: -
>>>>> self.printer.print_test_result_ln("CRASH") +
>>>>> self.printer.printout_ln(test_msg + "CRASH") with
>>>>> self._lock: self._crashed.append("%s (%s)" % (doc_path,
>>>>> backend.get_name())) return
>>>>>
>>>>> if test_is_failed: -
>>>>> self.printer.print_test_result_ln("FAIL (status error %d)"
>>>>> % (test_is_failed)) +
>>>>> self.printer.printout_ln(test_msg + "FAIL (status error
>>>>> %d)" % (test_is_failed)) with self._lock:
>>>>> self._failed_status_error("%s (%s)" % (doc_path,
>>>>> backend.get_name())) return -- 1.8.0.3
>>>>>
>>>>> _______________________________________________ poppler
>>>>> mailing list poppler at lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>
>> _______________________________________________ poppler mailing
>> list poppler at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/poppler
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAEBAgAGBQJQ30C+AAoJEPSSjE3STU34zZ4H/2wXadzNy60Qy9t/mqsP+/yL
2DnkY9ix96RdcG8wS1Thp985WlG9F0J0SLMXHWGLZip1FoiIn6gkMCkzHjcycKU5
aiokBX7tI0h9mLV531X2RPRWrTT9cEB+dZuLMZnzsbuKHe072R6+/gTlm8GqOk1/
dJCNLpB4JNAoFWOd2qNZ+LxI2yrrR1HUrD17VlstImd2mGykKUwRua2jfGls9ld4
KwV1UDWDT/gAmrDHm3X38HLTQBXIqhlfgP44A4Q4lOkqMGoPzsHWea6nW2cCwzWo
nBNDwVHD1wNtZC/PYMi4thjgO79beBrhhZD2/rLAtWJ422MJ6e8Gnqya0PSuyPI=
=BXwL
-----END PGP SIGNATURE-----
More information about the poppler
mailing list