[poppler] [RFC] Extend regtest framework to track performance

Carlos Garcia Campos carlosgc at gnome.org
Thu Dec 31 03:55:09 PST 2015


Adam Reichold <adam.reichold at t-online.de> writes:

> Hello,
>
> Am 30.12.2015 um 19:17 schrieb Carlos Garcia Campos:
>> Albert Astals Cid <aacid at kde.org> writes:
>> 
>>> El Wednesday 30 December 2015, a les 17:04:42, Adam Reichold va escriure:
>>>> Hello again,
>>>>
>>>> as discussed in the code modernization thread, if we are going to make
>>>> performance-oriented changes, we need a simple way to track functional and
>>>> performance regressions.
>>>>
>>>> The attached patch tries to extend the existing Python-based regtest
>>>> framework to measure run time and memory usage to spot significant
>>>> performance changes in the sense of relative deviations w.r.t. these
>>>> two parameters. It also collects the sums of both, which might be used as
>>>> "ballpark" numbers to compare the performance effect of changes over
>>>> document collections.
>>>
>>> Have you tried it? How stable are the numbers? For example, here I get,
>>> for rendering the same file (discarding the first run, which loads the file
>>> into memory), numbers that range from 620ms to 676ms, i.e. ~10% variation
>>> without any change at all.
>> 
>> I haven't looked at the patches in detail yet, but I don't think the
>> regtest framework should be the one doing the measuring. I would add a
>> tool for that, pdfperf or something like that, which could measure the
>> internals (parsing, output devs, etc.) in a more detailed way. If we
>> need to get more information from the poppler core, we could just add a
>> compile option to provide that info. The regtest framework would then
>> just run the perf command and collect the results. A report command
>> could compare results with refs or previous executions. I also think
>> performance tests should be run with a different command, since we don't
>> want to measure the performance of every single document we have in our
>> test suite (it would take too long). Instead, we could select a set of
>> documents with different kinds of PDF features and run perf tests on
>> those only.
>> 
>> So, in summary, I would use a dedicated tool (depending on a build
>> option if we need to get more info from the core), and maybe even a
>> dedicated perf framework on top of that tool, since I consider perf tests
>> different from rendering regression tests, just as unit tests are already
>> handled by a separate framework.
>
> I agree that a dedicated tool might provide more detailed information.
> But with the limited resources we have, even some information might be
> useful. Of course we should make it reliable, e.g. by improving upon the
> measurement procedure.

Yes, you are right. We can probably start with something like this. I
think there's an intermediate solution, though: we could at least
measure times per page, since there are documents with lots of pages,
and sometimes it's a single page containing a complex pattern or image
that causes the regression for the whole document. Measuring every page
makes it easier to track down those regressions and also provides more
accurate measurements than whole-document times. For that we would
probably need to change the tools (pdftoppm/pdftocairo/pdftotext) to
report rendering times per page (via a command line switch, for example).
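
Until such a switch exists, per-page numbers could be approximated by
driving the existing tools page by page from the outside. This is just an
illustrative sketch, not part of the patch: the use of pdfinfo for the page
count and the output path are assumptions, and the repeated process start-up
adds noise compared to a real in-tool switch. It relies only on pdftoppm's
existing -f/-l options:

    import re
    import subprocess
    import time

    def page_count(pdf):
        # Parse the "Pages:" line printed by pdfinfo.
        out = subprocess.run(["pdfinfo", pdf],
                             capture_output=True, text=True).stdout
        return int(re.search(r"^Pages:\s+(\d+)", out, re.MULTILINE).group(1))

    def time_pages(pdf, resolution=72):
        timings = {}
        for page in range(1, page_count(pdf) + 1):
            start = time.monotonic()
            # -f/-l restrict rendering to a single page.
            subprocess.run(["pdftoppm", "-r", str(resolution),
                            "-f", str(page), "-l", str(page),
                            pdf, "/tmp/perf-page"], check=True)
            timings[page] = time.monotonic() - start
        return timings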

I'm fine with extending the regtest framework, but what we don't want
for sure is to mix performance and rendering tests. Our current test
suite already takes a long time, and I also don't think we should test
performance in the same way. So, I would add a different subcommand,
run-perf-test for example, to be able to run a different subset of
tests and store the results differently (using a JSON file or
whatever, without affecting the rendering checksums). I'm not sure
having references like we do for regtests is the best approach. In this
case I would just keep the results of every revision, and a different
subcommand could be implemented to compare results and produce a report
with the improvements/regressions.
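
To make the idea concrete, the comparison could boil down to loading two
revisions' JSON results and flagging relative deviations. This is only a
sketch: the perf/results-<revision>.json layout, the metric names and the
5% threshold are assumptions, not anything that exists in the patch:

    import json

    TOLERANCE = 0.05  # flag relative deviations above 5% (arbitrary threshold)

    def load(revision):
        # One file per revision, mapping "document" (or "document:page")
        # to {"time": seconds, "memory": kilobytes}.
        with open("perf/results-%s.json" % revision) as f:
            return json.load(f)

    def compare(old_rev, new_rev):
        old, new = load(old_rev), load(new_rev)
        for key in sorted(set(old) & set(new)):
            for metric in ("time", "memory"):
                before, after = old[key][metric], new[key][metric]
                delta = (after - before) / before if before else 0.0
                if abs(delta) > TOLERANCE:
                    print("%s %s: %+.1f%% (%s -> %s)"
                          % (key, metric, delta * 100, before, after))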

What do you think?

> Also, users probably won't care about which part of the library
> produced a performance regression, so the overall numbers are indeed
> interesting IMHO.

This is not for users, this is for us :-) and we need to know which part
of the code introduced the regression. I agree we can start with a
simpler approach, but at least knowing whether the problem is in a
particular page or whether all pages regressed would help a lot in
identifying it.

> Especially since a developer can always do proper
> profiling when looking into a specific regression. Microbenchmarks, e.g.
> using QTest, also provide a certain balance w.r.t. these issues, as they
> can be used to continuously observe the performance of specific portions
> of the code base with more or less the same overhead as a unit test.
>
> Best regards, Adam.
>
>>> Cheers,
>>>   Albert
>>>
>>>>
>>>> The patch runs the measured commands repeatedly including warm-up
>>>> iterations and collects statistics from these runs. The measurement
>>>> results are stored as JSON documents with the actual program output of
>>>> e.g. pdftotext or pdftoppm being discarded.
>>>>
>>>> To implement the check for relative deviations, it abuses the checksum
>>>> comparison method, so checksums are still computed for the JSON
>>>> documents even though they are actually unnecessary. It is also limited
>>>> to Unix-like operating systems (due to the use of the wait3 syscall to
>>>> determine resource usage, similar to what the time command does).
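
For readers who haven't looked at the patch, the wait3-style measurement
described above boils down to something like the following. This is an
illustrative sketch under my own assumptions, not the patch's actual code;
note that on Linux ru_maxrss is reported in kilobytes:

    import os
    import subprocess
    import time

    def measure(command):
        # Run the command once; wait4 (a sibling of wait3 that takes a pid)
        # returns the child's resource usage, like the time command does.
        start = time.monotonic()
        child = subprocess.Popen(command, stdout=subprocess.DEVNULL,
                                 stderr=subprocess.DEVNULL)
        _, status, rusage = os.wait4(child.pid, 0)
        return {
            "wall_time": time.monotonic() - start,
            "user_time": rusage.ru_utime,
            "sys_time": rusage.ru_stime,
            "max_rss_kb": rusage.ru_maxrss,  # kilobytes on Linux
            "exit_status": status,
        }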

-- 
Carlos Garcia Campos
PGP key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x523E6462