[poppler] RFC: patch: parallel testing
Carlos Garcia Campos
carlosgc at gnome.org
Sat Dec 29 08:13:53 PST 2012
Adam Reichold <adamreichold at myopera.com> writes:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello Thomas,
>
> It is not a very sophisticated solution, but I think the attached
> patch should ensure useful messages for interleaved output of several
> worker threads.
I hadn't noticed this problem, and I can only reproduce it when running
the tests in verbose mode, does it only happen when stdout is not a tty?
> Best regards, Adam.
>
> Am 29.12.2012 10:32, schrieb Thomas Freitag:
>> Hi Adam!
>>
>> I used yesterday the first time Your patch for regtesting: Awfully,
>> congratulations. You probably already encountered my comment on bug
>> 50992. One minor thing: Unfortunately the protocol is not really
>> readable anymore, because each thread i.e. first outputs
>>
>> Testing '/media/thomas/HD-PCTU3/PDF Suite/01_ump_a_2009.pdf' using
>> splash backend (9/1137):
>>
>> and when it finishes i.e.
>>
>> PASS
>>
>> Because of the worker threads it's now nearly impossible to match
>> the test itself with the result of the test. Can You change it in
>> that way that this is outputted together? Should be quite easy....
>>
>> Cheers, Thomas
>>
>> Am 06.12.2012 19:02, schrieb Adam Reichold: Hello again,
>>
>> I reformatted the patch hopefully solving the whitespace issues.
>> (It does apply and run for me, but there are still some whitespace
>> warnings.)
>>
>> The use of recursive locks is probably not necessary in
>> "TestRun.py" and "TestReferences.py", but since the performance
>> impact is probably negligible in this case, I thought allowing for
>> recursion would help to keep the code more accessible. In
>> "Printer.py" there is at least one recursion from
>> "printout_update" into "printout".
>>
>> Best regards, Adam.
>>
>> Am 06.12.2012 18:19, schrieb Adam Reichold:
>>>>> Hello Carlos,
>>>>>
>>>>> Am 06.12.2012 18:11, schrieb Carlos Garcia Campos:
>>>>>> Adam Reichold <adamreichold at myopera.com> writes:
>>>>>>> Hello again,
>>>>>>>
>>>>>>> As suggested by William Bader, I added the option to use
>>>>>>> the logical CPU count as the number of worker threads
>>>>>>> automatically.
>>>>>> Thanks for the patch, any improvement in the regression
>>>>>> test framework is more than welcome, even more if it's to
>>>>>> improve the performance. What poppler have you sued to
>>>>>> create the patches? They don't apply for me here using
>>>>>> poppler from current git master.
>>>>> Yes, it is about performance and I am hesitant about
>>>>> introducing such complexity into the test driver which is
>>>>> supposed to test the correctness of program itself, but
>>>>> faster regression testing probably means more regression
>>>>> testing... :-)
>>>>>
>>>>> I am not sure how to correctly format the patch. I built on
>>>>> the current master but it seems Git thinks that the lines
>>>>> where I just changed the indentation are whitespace errors.
>>>>> I am not sure how to correct this as changes in whitespace
>>>>> are obviously important for Python scripts... I'll try to
>>>>> research this, but any advice would be welcome.
>>>>>
>>>>> Best regards, Adam.
>>>>>
>>>>>>> Regards, Adam.
>>>>>>>
>>>>>>> Am 04.12.2012 12:28, schrieb Adam Reichold:
>>>>>>>> Hello Thomas,
>>>>>>>>
>>>>>>>> Am 04.12.2012 08:41, schrieb Thomas Freitag:
>>>>>>>>> Am 04.12.2012 07:45, schrieb Adam Reichold: Hello
>>>>>>>>> everyone,
>>>>>>>>>
>>>>>>>>> I currently try to get myself acquainted with
>>>>>>>>> Poppler's regression testing framework. Because my
>>>>>>>>> system has a rather low single-threaded performance,
>>>>>>>>> I tried to implement parallel testing using Python's
>>>>>>>>> Queue class.
>>>>>>>>>
>>>>>>>>> Even though poppler-regtest currently uses two
>>>>>>>>> processes per test file, rendering even and odd
>>>>>>>>> pages respectively, the test files themselves are
>>>>>>>>> still handled sequentially and both process are
>>>>>>>>> joined for each test file. This will yield suboptimal
>>>>>>>>> system utilization even for a small three-core system
>>>>>>>>> like mine.
>>>>>>>>>
>>>>>>>>> Using the "-t/--threads N" option in the patched
>>>>>>>>> poppler-regtest will spawn N worker threads that
>>>>>>>>> handle all tests they can get from a single queue
>>>>>>>>> for all known tests, allowing to heavily utilize
>>>>>>>>> also large system if using a large set of test cases.
>>>>>>>>> But even for my three-core system, this brought down
>>>>>>>>> the time to create references for the complete test
>>>>>>>>> suite using the Splash backend from 4,5 hours to
>>>>>>>>> 2,75 hours.
>>>>>>>>>> What do You mean with "patched poppler-regtest"?
>>>>>>>>>> There was no attachment, or do I miss something?
>>>>>>>> No, this on the process level in the test driver and
>>>>>>>> independent of the your multi-threading work. There
>>>>>>>> was a patch called "parallel_testing.patch" attached,
>>>>>>>> as least my mail client tells me so.
>>>>>>>>
>>>>>>>> I'll just try again...
>>>>>>>>
>>>>>>>> Best regards, Adam.
>>>>>>>>
>>>>>>>>>> If You talk about my multi-threaded testcase,
>>>>>>>>>> please be aware, that it is still experimental,
>>>>>>>>>> even if quite close. Here the result of my last
>>>>>>>>>> regression test last sunday evening: Total 1133
>>>>>>>>>> tests 1114 tests passed (98.32%) 17 tests failed
>>>>>>>>>> (1.50%): /media/thomas/HD-PCTU3/PDF
>>>>>>>>>> Suite/Algorithmics - The Spirit of Computing, 3rd
>>>>>>>>>> Ed.pdf (splash), /media/thomas/HD-PCTU3/PDF
>>>>>>>>>> Suite/Essentials of English Grammar -
>>>>>>>>>> www.ielts4u.blogfa.com.pdf ... 2 tests crashed
>>>>>>>>>> (0.18%): /media/thomas/HD-PCTU3/PDF
>>>>>>>>>> Suite/bug157090.pdf (splash),
>>>>>>>>>> /media/thomas/HD-PCTU3/PDF Suite/sinatr4c.f5.pdf
>>>>>>>>>> (splash) I'll first have a look at the crashes
>>>>>>>>>> next weekend, then I'll continue with looking at
>>>>>>>>>> the failed tests.... Cheers, Thomas
>>>>>>>>>
>>>>>>>>> IMHO, the necessary changes seem quite small
>>>>>>>>> especially since a lot of them are connected to
>>>>>>>>> indentation handling. What are your thoughts on the
>>>>>>>>> utility and implementation of this?
>>>>>>>>>
>>>>>>>>> Best regards, Adam.
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> poppler mailing list poppler at lists.freedesktop.org
>>>>>>>>>>
>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
> _______________________________________________ poppler
>>>>>>>>> mailing list poppler at lists.freedesktop.org
>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
> _______________________________________________ poppler
>>>>>>>>> mailing list poppler at lists.freedesktop.org
>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>
>>>>>
>>>>>>>>>
>>>>>>>>>
> _______________________________________________ poppler mailing
>>>>> list poppler at lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>
>>>>>
>>>
>>>
>>> _______________________________________________ poppler mailing
>>> list poppler at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.19 (GNU/Linux)
>
> iQEcBAEBAgAGBQJQ3vyQAAoJEPSSjE3STU34/9kH/iOG+N6WLsZKKKYeKtxEOtda
> TMT1s6SsiWCEQIHIZNMoJvpHAO75mDXU3M+/QXuYre6Bhn+1+PlsVI/S74wus9HI
> 8jJC0sDOBYvM3PrEng+kjMsff/iOH6hsMrPbL/R/qAB4qjdCQSeAvloYu4TaSeB4
> 6dyqoZ05qxUt5UMEtVqDQLGHMllL2fwOMy3//elpzCiDArR/okv51p9r+qiEbQAe
> 7a+xpRplFSbQs0AvNqlE1GFFGVRusckzH6aXLng//DiCbJ+8iyGqZ5ALqZyk/HeH
> fRb4e+68siIRf+NvpUktNJCfmvjqfF47yaW+e3JAxTm848sHtmEqHqeEyT/wL98=
> =CKSU
> -----END PGP SIGNATURE-----
> From 5e7461569e3085370326203539fa02de968eb2ca Mon Sep 17 00:00:00 2001
> From: Adam Reichold <adamreichold at myopera.com>
> Date: Sat, 29 Dec 2012 15:17:33 +0100
> Subject: [PATCH] print messages that are useful with interleaved output by
> several worker threads
>
> ---
> regtest/TestRun.py | 18 +++++++++---------
> 1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/regtest/TestRun.py b/regtest/TestRun.py
> index 5564757..750b93c 100644
> --- a/regtest/TestRun.py
> +++ b/regtest/TestRun.py
> @@ -71,7 +71,7 @@ class TestRun:
> with self._lock:
> self._n_tests += 1
>
> - self.printer.print_test_start("Testing '%s' using %s backend (%d/%d): " % (doc_path, backend.get_name(), n_doc, total_docs))
> + test_msg = "Testing '%s' using %s backend (%d/%d): " % (doc_path, backend.get_name(), n_doc, total_docs)
> test_has_md5 = backend.create_refs(doc_path, test_path)
>
> if backend.has_stderr(test_path):
> @@ -81,24 +81,24 @@ class TestRun:
> if ref_has_md5 and test_has_md5:
> if backend.compare_checksums(refs_path, test_path, not self.config.keep_results, self.config.create_diffs, self.config.update_refs):
> # FIXME: remove dir if it's empty?
> - self.printer.print_test_result("PASS")
> + self.printer.printout(test_msg + "PASS")
> with self._lock:
> self._n_passed += 1
> else:
> - self.printer.print_test_result_ln("FAIL")
> + self.printer.printout_ln(test_msg + "FAIL")
> with self._lock:
> self._failed.append("%s (%s)" % (doc_path, backend.get_name()))
> return
> elif test_has_md5:
> if ref_is_crashed:
> - self.printer.print_test_result_ln("DOES NOT CRASH")
> + self.printer.printout_ln(test_msg + "DOES NOT CRASH")
> elif ref_is_failed:
> - self.printer.print_test_result_ln("DOES NOT FAIL")
> + self.printer.printout_ln(test_msg + "DOES NOT FAIL")
> return
>
> test_is_crashed = backend.is_crashed(test_path)
> if ref_is_crashed and test_is_crashed:
> - self.printer.print_test_result("PASS (Expected crash)")
> + self.printer.printout_ln(test_msg + "PASS (Expected crash)")
> with self._lock:
> self._n_passed += 1
> return
> @@ -106,19 +106,19 @@ class TestRun:
> test_is_failed = backend.is_failed(test_path)
> if ref_is_failed and test_is_failed:
> # FIXME: compare status errors
> - self.printer.print_test_result("PASS (Expected fail with status error %d)" % (test_is_failed))
> + self.printer.printout(test_msg + "PASS (Expected fail with status error %d)" % (test_is_failed))
> with self._lock:
> self._n_passed += 1
> return
>
> if test_is_crashed:
> - self.printer.print_test_result_ln("CRASH")
> + self.printer.printout_ln(test_msg + "CRASH")
> with self._lock:
> self._crashed.append("%s (%s)" % (doc_path, backend.get_name()))
> return
>
> if test_is_failed:
> - self.printer.print_test_result_ln("FAIL (status error %d)" % (test_is_failed))
> + self.printer.printout_ln(test_msg + "FAIL (status error %d)" % (test_is_failed))
> with self._lock:
> self._failed_status_error("%s (%s)" % (doc_path, backend.get_name()))
> return
> --
> 1.8.0.3
>
> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/poppler
--
Carlos Garcia Campos
PGP key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x523E6462
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20121229/451fc14c/attachment-0001.pgp>
More information about the poppler
mailing list