[poppler] [RFC] Extend regtest framework to track performance

Adam Reichold adam.reichold at t-online.de
Wed Dec 30 16:03:53 PST 2015


Hello,

On 30.12.2015 at 19:17, Carlos Garcia Campos wrote:
> Albert Astals Cid <aacid at kde.org> writes:
> 
>> On Wednesday 30 December 2015, at 17:04:42, Adam Reichold wrote:
>>> Hello again,
>>>
>>> as discussed in the code modernization thread, if we are going to make
>>> performance-oriented changes, we need a simple way to track functional and
>>> performance regressions.
>>>
>>> The attached patch tries to extend the existing Python-based regtest
>>> framework to measure run time and memory usage to spot significant
>>> performance changes in the sense of relative deviations w.r.t. these
>>> two parameters. It also collects the sums of both which might be used as
>>> "ball park" numbers to compare the performance effect of changes over
>>> document collections.
>>
>> Have you tried it? How stable are the numbers? For example, here I get for
>> rendering the same file (discarding the first time that is loading the file
>> into memory) numbers that range from 620ms to 676ms, i.e. ~10% variation
>> without any change at all.
> 
> I haven't looked at the patches in detail yet, but I don't think the
> regtest framework should be the one measuring. I would add a tool for
> that, pdfperf or something like that, that could measure the internals
> (parsing, output devs, etc.) in a more detailed way. If we
> need to get more information from the poppler core we could just add a
> compile option to provide that info. And the regtest framework should
> just run the perf command and collect the results. A report command
> could compare results with refs or previous executions. I also think
> performance tests should be run with a different command, since we don't
> want to measure the performance of every single document we have in our
> test suite (it would take too long). Instead, we could select a set of
> documents with different kinds of PDF features to run perf tests on those
> only.
> 
> So, in summary, I would use a dedicated tool (depending on a build
> option if we need to get more info from the core), and maybe even a
> dedicated perf framework on top of that tool, since I consider perf tests
> different from rendering regression tests, the same way unit tests are
> handled by a different framework too.

I agree that a dedicated tool might provide more detailed information.
But with the limited resources we have, even some information might be
useful. Of course we should make it reliable, e.g. by improving upon the
measurement procedure.
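
As a rough illustration of such a measurement procedure (warm-up
iterations, repeated runs, resource usage as reported by the kernel),
something along the following lines could work. This is only a sketch
and not the code from the attached patch: the names summarize, measure,
warmup and runs are made up for this example, it uses os.wait4 (the
per-PID sibling of wait3) and it is limited to Unix-like systems.

```python
import json
import os
import statistics
import subprocess
import sys
import time


def summarize(samples):
    """Return mean and sample standard deviation of a list of numbers."""
    deviation = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return {"mean": statistics.mean(samples), "stdev": deviation}


def measure(command, warmup=1, runs=5):
    """Run command warmup + runs times and summarize the measured runs."""
    times, memory = [], []
    for iteration in range(warmup + runs):
        start = time.perf_counter()
        # The program output is discarded, only resource usage is kept.
        process = subprocess.Popen(command, stdout=subprocess.DEVNULL,
                                   stderr=subprocess.DEVNULL)
        # Wait for exactly this child and fetch its resource usage.
        _pid, status, usage = os.wait4(process.pid, 0)
        elapsed = time.perf_counter() - start
        # Tell Popen that the child has already been reaped.
        process.returncode = (os.WEXITSTATUS(status) if os.WIFEXITED(status)
                              else -os.WTERMSIG(status))
        if iteration < warmup:
            continue  # warm-up iterations are run but not recorded
        times.append(elapsed)
        # ru_maxrss is in kilobytes on Linux but in bytes on macOS.
        memory.append(usage.ru_maxrss)
    return {"run_time": summarize(times), "memory": summarize(memory)}


# Example: measure a trivial command and store the result as JSON.
result = measure([sys.executable, "-c", "pass"], warmup=1, runs=3)
print(json.dumps(result, indent=2))
```

Reporting the standard deviation next to the mean gives at least a
crude idea of how noisy a given measurement is.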

Also, users probably won't care which part of the library
produced a performance regression, so the overall numbers are indeed
interesting IMHO. Especially since a developer can always do proper
profiling when looking into a specific regression. Microbenchmarks, e.g.
using QTest, also provide a certain balance w.r.t. these issues, as they
can be used to continuously observe the performance of specific portions
of the code base with more or less the same overhead as a unit test.
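
Regarding the overall numbers, a check for relative deviations against
a reference value could look roughly like the sketch below. The function
names and the tolerance of 15% are assumptions made for this example
and are not taken from the patch; the tolerance has to be larger than
the run-to-run noise, e.g. the roughly 9% spread between the 620ms and
676ms that Albert reported.

```python
def relative_deviation(reference, measured):
    """Return the deviation of measured from reference as a fraction."""
    return (measured - reference) / reference


def is_significant_change(reference, measured, tolerance=0.15):
    """Report changes in either direction that exceed the tolerance."""
    return abs(relative_deviation(reference, measured)) > tolerance


# The spread Albert observed stays below the assumed tolerance ...
print(is_significant_change(620.0, 676.0))
# ... whereas a much slower run would be flagged.
print(is_significant_change(620.0, 800.0))
```

Checking the absolute value means that unexpected speed-ups are reported
as well, which can be useful to spot functional breakage.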

Best regards, Adam.

>> Cheers,
>>   Albert
>>
>>>
>>> The patch runs the measured commands repeatedly including warm-up
>>> iterations and collects statistics from these runs. The measurement
>>> results are stored as JSON documents with the actual program output of
>>> e.g. pdftotext or pdftoppm being discarded.
>>>
>>> To implement the check for relative deviations, it abuses the checksum
>>> comparison method and hence checksums are still computed for the JSON
>>> documents even though they are actually unnecessary. It is also limited
>>> to Unix-like operating systems (due to the use of the wait3 syscall to
>>> determine resource usage similar to the time command).
>>
>> _______________________________________________
>> poppler mailing list
>> poppler at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/poppler
> 
> 
> 
