[Pixman] Testing (Re: [PATCH 3/3] ARMv6: Add fast path for over_n_8888_8888_ca)
Pekka Paalanen
ppaalanen at gmail.com
Fri Apr 4 00:28:05 PDT 2014
On Fri, 4 Apr 2014 08:24:18 +0300
Siarhei Siamashka <siarhei.siamashka at gmail.com> wrote:
> On Mon, 31 Mar 2014 15:03:45 +0300
> Pekka Paalanen <ppaalanen at gmail.com> wrote:
>
> > From: Ben Avison <bavison at riscosopen.org>
> >
> > Benchmark results, "before" is the patch
> > - ARMv6: Add fast path for over_reverse_n_8888,
> > "after" contains the additional patches:
> > - ARM: share pixman_asm_function definition
> > - ARMv6: Support for very variable-hungry composite operations
> > - ARMv6: Add fast path for over_n_8888_8888_ca (this patch)
> >
> > lowlevel-blt-bench, over_n_8888_8888_ca, 100 iterations:
> >
> >     Before         After
> >     Mean  StdDev   Mean  StdDev  Confidence   Change
> > L1   2.7     0.0   16.0     0.0     100.00%  +495.0%
> > L2   2.4     0.0   14.3     0.2     100.00%  +497.7%
> > M    2.3     0.0   14.8     0.0     100.00%  +528.6%
> > HT   2.2     0.0    9.6     0.0     100.00%  +341.4%
> > VT   2.2     0.0    9.4     0.0     100.00%  +331.7%
> > R    2.2     0.0    9.4     0.0     100.00%  +327.3%
> > RT   1.9     0.0    5.3     0.1     100.00%  +181.5%
> >
> > At most 3 outliers rejected per case per set.
> >
> > cairo-perf-trace with trimmed traces, 30 iterations:
> >
> >                                 Before         After
> >                                 Mean  StdDev   Mean  StdDev  Confidence   Change
> > t-firefox-talos-gfx.trace       32.9     0.4   25.4     0.4     100.00%   +29.6%
> > t-firefox-scrolling.trace       31.2     0.1   24.6     0.1     100.00%   +26.7%
> > t-gnome-terminal-vim.trace      22.2     0.1   19.8     0.2     100.00%   +11.7%
> > t-firefox-planet-gnome.trace    11.5     0.0   10.9     0.0     100.00%    +6.4%
> > t-evolution.trace               13.8     0.1   13.0     0.1     100.00%    +5.9%
> > t-gvim.trace                    33.5     0.2   33.0     0.2     100.00%    +1.3%
> > t-xfce4-terminal-a1.trace        4.8     0.0    4.8     0.0     100.00%    +1.1%
> > t-poppler-reseau.trace          22.4     0.1   22.1     0.1     100.00%    +1.0%
> > t-firefox-talos-svg.trace       20.5     0.1   20.4     0.0     100.00%    +0.7%
> > t-gnome-system-monitor.trace    17.2     0.0   17.1     0.0     100.00%    +0.6%
> > t-swfdec-giant-steps.trace      14.9     0.0   14.8     0.0     100.00%    +0.6%
> > t-midori-zoomed.trace            8.0     0.0    8.0     0.0     100.00%    +0.5%
> > t-firefox-paintball.trace       18.0     0.0   17.9     0.0     100.00%    +0.5%
> > t-firefox-canvas.trace          18.0     0.0   17.9     0.0     100.00%    +0.3%
> > t-firefox-asteroids.trace       11.1     0.0   11.1     0.0     100.00%    +0.3%
> > t-firefox-fishbowl.trace        21.2     0.0   21.1     0.0     100.00%    +0.3%
> > t-chromium-tabs.trace            4.9     0.0    4.9     0.0      95.59%    +0.3% (insignificant)
> > t-poppler.trace                  9.7     0.0    9.7     0.1      92.48%    +0.2% (insignificant)
> > t-firefox-canvas-swscroll.trace 32.1     0.1   32.1     0.1      76.28%    +0.1% (insignificant)
> > t-firefox-fishtank.trace        13.2     0.0   13.2     0.0      82.91%    +0.0% (insignificant)
> > t-swfdec-youtube.trace           7.8     0.0    7.8     0.0      16.82%    +0.0% (insignificant)
> > t-firefox-chalkboard.trace      36.6     0.0   36.6     0.0     100.00%    -0.1%
> > t-grads-heat-map.trace           4.4     0.0    4.4     0.0      99.95%    -0.6%
> > t-firefox-particles.trace       27.3     0.2   27.5     0.1     100.00%    -0.6%
> > t-firefox-canvas-alpha.trace    20.5     0.3   20.7     0.3      97.72%    -0.8% (insignificant)
> >
> > At most 6 outliers rejected per case per set.
> >
> > Cairo perf reports the running time, but the change is computed for
> > operations per second instead (inverse of running time).
> >
> > Confidence is based on Welch's t-test. Absolute changes less than 1%
> > can be accounted as measurement errors, even if statistically
> > significant.
> >
> > v4, Pekka Paalanen <pekka.paalanen at collabora.co.uk> :
> > Use pixman_asm_function instead of startfunc.
> > Rebased. Re-benchmarked on Raspberry Pi.
> > Commit message.
>
> Appears that this code fails the 'blitters-test' if it is run a bit
> longer than the default use of it in 'make check':
>
>
> ./fuzzer-find-diff.pl ./blitters-test.generic ./blitters-test.armv6
>
> [...]
>
> op=PIXMAN_OP_OVER
> src_fmt=r5g6b5, dst_fmt=a8r8g8b8, mask_fmt=a8r8g8b8
> src_width=1, src_height=1, dst_width=235, dst_height=12
> src_x=0, src_y=0, dst_x=49, dst_y=7
> src_stride=12, dst_stride=940
> w=185, h=1
>
> [...]
>
> The problematic conditions can be reproduced by running:
> ./blitters-test 4928372
>
>
> And this is not the first time we miss a bug because of running
> blitters-test just a little bit shorter than would be necessary
> to detect it:
> http://lists.freedesktop.org/archives/pixman/2013-March/002700.html
> And we had a similar near miss in a couple of other cases. So let's
> increase the loop counter in it from 2000000 -> 10000000 this time for
> real. The downside is that 'make check' is going to run longer
> (especially on the Raspberry Pi or MIPS32). But on my desktop PC it
> changes the time spent on running this particular test from ~3.6s
> to ~23s.
>
> As this patch can't go in yet, we also put "ARMv6: Support for very
> variable-hungry composite operations" on hold for a bit.
>
Hi,

thank you for pushing the two patches and for running the extended
tests. I will check with Ben on what to do here.

Could someone point me to a document describing how to use these
testing tools properly? Hopefully it would answer all my questions
below.

In the pixman test directory on the rpi, I ran

$ ./fuzzer-find-diff.pl ./blitters-test.generic ./blitters-test.armv6 10000000
Success: 10000000 tests finished

I also ran it without the '10000000' argument and waited a lot longer
(a few minutes), and it never indicated an error.

Should I somehow manually create the binaries blitters-test.generic and
blitters-test.armv6 before running that command? Right now there are
no files with those names anywhere, so I was a bit surprised that it
ran just fine.

I also couldn't figure out how 'make check' does this comparison, if
it is supposed to have those two binaries built separately somehow.

OTOH, I see that fuzzer_test_main() takes a checksum as an argument.
How do you determine what the correct checksum should be?

After reading the big comment on fuzzer_test_main() and the usage of
fuzzer-find-diff.pl, I have a hunch that the procedure would be
something like this:
- compile pixman without optimizations, producing a statically linked
  blitters-test (how?), and rename it to blitters-test.generic
- compile pixman with optimizations, producing a statically linked
  blitters-test (how?), and rename it to blitters-test.armv6 (on rpi)
- run both with fuzzer-find-diff.pl for the predetermined number of
  rounds
- if no differences are found, take the final checksum (from where?)
  and hardcode it in the fuzzer_test_main() call.

And normally that would be done only by the maintainers, or when
someone adds a new fuzzer test. Is that right?

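To make my guess concrete, here is a minimal sketch in Python of how I imagine two builds being compared. Everything in it is a made-up stand-in and not pixman code: run_test(), reference_add() and buggy_add() are hypothetical, and bisecting over per-range checksums is only my assumption about how a tool like fuzzer-find-diff.pl could narrow down a failing test number. The one property it relies on is that each test is fully determined by its number.

```python
import zlib


def reference_add(src, dst):
    """Trusted path (stand-in for the generic build): saturating add on every pixel."""
    return bytes(min(src + d, 255) for d in dst)


def buggy_add(src, dst):
    """'Optimized' path with a planted bug: for width 29 the last pixel is left untouched."""
    out = bytearray(reference_add(src, dst))
    if len(dst) == 29:
        out[-1] = dst[-1]
    return bytes(out)


def run_test(testnum, composite):
    """One hypothetical fuzzer round; all inputs are derived from the test number alone."""
    width = testnum % 37 + 1
    src = (testnum * 7919) % 256
    dst = bytes(((testnum + i) * 31) % 256 for i in range(width))
    return zlib.crc32(composite(src, dst))


def range_checksum(composite, first, last):
    """Fold the per-test CRCs of tests first..last-1 into one running checksum."""
    crc = 0
    for testnum in range(first, last):
        crc = zlib.crc32(run_test(testnum, composite).to_bytes(4, "little"), crc)
    return crc


def find_first_diff(impl_a, impl_b, first, last):
    """Return the lowest test number in [first, last) where the two disagree, or None."""
    if range_checksum(impl_a, first, last) == range_checksum(impl_b, first, last):
        return None
    lo, hi = first, last  # invariant: the first difference lies in [lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if range_checksum(impl_a, lo, mid) != range_checksum(impl_b, lo, mid):
            hi = mid
        else:
            lo = mid
    return lo


if __name__ == "__main__":
    print(find_first_diff(reference_add, buggy_add, 0, 1000))
```

With the planted bug, the search lands on test number 28, which could then be re-run on its own, in the same spirit as './blitters-test 4928372' above.
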
For building the generic version, or whatever version is the gold
standard, should I use all the --disable switches mentioned
in ./configure --help?

And does 'make check' only run whatever was built and compare the
result against the hardcoded checksum?

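If that reading is right, the 'make check' side could be sketched like this in Python. This is only an illustration under my own assumptions: run_fuzzer() and one_test() are hypothetical names, the round counts are scaled down by 1000 from the 2000000 and 10000000 mentioned above, and I assume the expected checksum is simply whatever a trusted build produces. It shows why a bug that first triggers in a late round passes a short run and fails a longer one.

```python
import zlib

SHORT_ROUNDS = 2000      # scaled-down stand-in for the current loop count
LONG_ROUNDS = 10000      # scaled-down stand-in for the proposed loop count
FIRST_BAD_ROUND = 4928   # hypothetical round in which the planted bug triggers


def one_test(testnum, buggy):
    """Hypothetical fuzzer round: output bytes depend only on the test number."""
    value = (testnum * 2654435761) % 2**32
    if buggy and testnum == FIRST_BAD_ROUND:
        value ^= 1  # planted bug: a single wrong bit in a single round
    return value.to_bytes(4, "little")


def run_fuzzer(rounds, buggy=False):
    """Accumulate one running CRC32 over rounds 1..rounds."""
    crc = 0
    for testnum in range(1, rounds + 1):
        crc = zlib.crc32(one_test(testnum, buggy), crc)
    return crc


# "Hardcoded" expected values, assumed here to come from a trusted build.
EXPECTED_SHORT = run_fuzzer(SHORT_ROUNDS)
EXPECTED_LONG = run_fuzzer(LONG_ROUNDS)


def check(rounds, expected, buggy):
    """What I assume 'make check' boils down to: run and compare one checksum."""
    return run_fuzzer(rounds, buggy) == expected


if __name__ == "__main__":
    print("short run passes:", check(SHORT_ROUNDS, EXPECTED_SHORT, buggy=True))
    print("long run passes: ", check(LONG_ROUNDS, EXPECTED_LONG, buggy=True))
```

The single checksum only says that something differed somewhere in the run; finding the round still needs a comparison against a second build.
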
Thanks,
pq