[Pixman] Testing (Re: [PATCH 3/3] ARMv6: Add fast path for over_n_8888_8888_ca)

Mon Apr 7 00:50:03 PDT 2014

On Fri, 4 Apr 2014 10:28:05 +0300
Pekka Paalanen <ppaalanen at gmail.com> wrote:

> Hi,
> 
> thank you for pushing the two patches, and running extended tests. I
> will check with Ben on what to do here.
> 
> Could someone point me to a document describing how one uses these
> testing tools properly? Hopefully it would answer all my questions
> below.

Unfortunately, the only documentation for this particular testing tool
is the usage text that the fuzzer-find-diff.pl script prints when run
without any arguments, plus a comment in the code for the
fuzzer_test_main() function:
    http://cgit.freedesktop.org/pixman/tree/test/utils.c?id=pixman-0.32.4#n670
But you have found all of this already, and I have no additional links
or documents to share. Sorry about this. A Google search may also turn
up some relevant posts on the pixman mailing list.

The documentation clearly needs improvements. Your feedback is valuable
and helps to identify the gaps in it.

> In the pixman test directory on the rpi, I issued
> $ ./fuzzer-find-diff.pl ./blitters-test.generic ./blitters-test.armv6 10000000
> Success: 10000000 tests finished
> 
> And also without the '10000000' argument, I waited for a lot longer (a
> few minutes), and it never indicated an error.
> 
> Should I somehow manually create the binaries blitters-test.generic and
> blitters-test.armv6 before running that command? Right now, there are
> no files with those names anywhere, so I was a bit surprised it ran
> just fine.

The script compared the results of trying to run one non-existing
program with the results of trying to run another non-existing
program. No difference was found because both fail in the same way and
therefore produce identical output on stdout.

The script can surely be made more foolproof by handling the special
case of trying to execute something that does not exist.
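
Such a guard might look like the following minimal Python sketch. The
real script is written in Perl, and the helper names
missing_programs() and check_or_complain() are made up for this
illustration. The idea is simply to refuse to compare anything unless
every program can actually be executed:

```python
import shutil
import sys


def missing_programs(paths):
    """Return the entries of `paths` that cannot be executed."""
    # shutil.which() returns None for a file that is absent or not
    # executable. A path with a directory part (such as ./foo) is
    # checked as given instead of being searched for in PATH.
    return [p for p in paths if shutil.which(p) is None]


def check_or_complain(paths, out=sys.stderr):
    """Print one line per unusable program; return True if all are usable."""
    bad = missing_programs(paths)
    for p in bad:
        print("error: cannot execute '%s'" % p, file=out)
    return not bad
```

Calling check_or_complain() with both binary paths before the first
comparison would report the missing files instead of a misleading
"Success".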

> I also couldn't figure out, how does 'make check' do this comparison,

The tests based on fuzzer_test_main() run a batch of subtests, which do
pseudo-random composite operations on images. The outcome of each
subtest (a 32-bit checksum) is deterministic and depends only on its
seed for the pseudo-random number generator. The outcome of
fuzzer_test_main() itself is also a 32-bit checksum, which depends on
the range of seeds that are tested.

So in the end all we have is a single checksum. Because a large
number of different pseudo-random operations exercise a lot of
different code paths in pixman, this checksum is reasonably sensitive
to changes in pixman's behaviour.
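
As a toy model of this scheme (this is not pixman's code), the Python
sketch below uses subtest_checksum() as a stand-in for one
pseudo-random composite operation and range_checksum() as a stand-in
for a whole batch. Both names are hypothetical. Hashing random bytes
with CRC-32 and folding the results with 32-bit addition are
assumptions made only to keep the sketch short:

```python
import random
import zlib


def subtest_checksum(seed):
    """Stand-in for one subtest: maps a seed to a 32-bit checksum."""
    rng = random.Random(seed)  # private PRNG, seeded per subtest
    # Pretend these bytes are the pixels left behind by a composite op.
    pixels = bytes(rng.randrange(256) for _ in range(64))
    return zlib.crc32(pixels) & 0xFFFFFFFF


def range_checksum(first, last):
    """Fold the subtest checksums for seeds first..last into 32 bits."""
    total = 0
    for seed in range(first, last + 1):
        # Assumption of this sketch: results are folded by addition
        # modulo 2**32.
        total = (total + subtest_checksum(seed)) & 0xFFFFFFFF
    return total
```

Running range_checksum() twice over the same range gives the same
value, and a change to the result of any single seed changes the value
for every range that contains that seed.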

There are two ways to use this checksum. The first is part of the
'make check' run: we try seeds from 1 to 2000000 in blitters-test and
hardcode the expected checksum there. If the calculated checksum
matches the expected one, the test passes. Super simple! But a failure
does not tell us much about why exactly it failed or what has changed.

The second is to prepare two fuzzer_test_main() based test binaries
which are supposed to behave exactly the same (except for performance
differences). If we run these binaries to calculate checksums for
different ranges of seeds, we expect the same checksums from both. If
there is a 'make check' failure in blitters-test, then we know that at
least one seed in the range from 1 to 2000000 causes a difference in
the final checksum. We only need to identify it, and the
fuzzer-find-diff.pl script can do this.
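
One way to narrow a differing range down to a single seed is plain
bisection, sketched below in Python. This is an illustration and not
necessarily what fuzzer-find-diff.pl does internally. The two test
binaries are modelled as functions that take a seed range and return a
checksum; find_bad_seed(), toy_ref() and toy_opt() are hypothetical
names, and toy_opt() is deliberately broken for seed 1234:

```python
def find_bad_seed(run_ref, run_opt, first, last):
    """Return a seed in first..last on which the two runners disagree,
    or None if their checksums for the whole range already match."""
    if run_ref(first, last) == run_opt(first, last):
        return None
    lo, hi = first, last
    while lo < hi:
        mid = (lo + hi) // 2
        if run_ref(lo, mid) != run_opt(lo, mid):
            hi = mid        # the lower half already shows a difference
        else:
            lo = mid + 1    # otherwise continue in the upper half
    # Confirm the single remaining seed before reporting it.
    return lo if run_ref(lo, lo) != run_opt(lo, lo) else None


def toy_ref(first, last):
    """Toy reference runner: a 32-bit sum of a per-seed value."""
    return sum(s * 2654435761 for s in range(first, last + 1)) & 0xFFFFFFFF


def toy_opt(first, last):
    """Toy optimized runner: same as toy_ref, but wrong for seed 1234."""
    total = toy_ref(first, last)
    if first <= 1234 <= last:
        total = (total + 1) & 0xFFFFFFFF
    return total
```

Here find_bad_seed(toy_ref, toy_opt, 1, 2000) returns 1234. With real
binaries the two runner functions would start the programs on a seed
range and compare what they print. If several bad seeds happen to
cancel each other out within a range, bisection on checksums alone can
miss them, which is why the last step checks the single seed again.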

The over_n_8888_8888_ca failure is a bit special. The problem is that
trying only the seeds from 1 to 2000000 in blitters-test does not
provide enough coverage to catch all bugs. That's why 'make check'
missed it.

> if it is supposed to have those two binaries built separately somehow?

Not as part of 'make check'. But yes, two binaries are used by
the fuzzer-find-diff.pl script.

> OTOH, I see that fuzzer_test_main() takes a checksum as an argument.
> How do you determine what the correct checksum should be?
> After reading the big comment on fuzzer_test_main() and the usage of
> fuzzer-find-diff.pl, I'm getting the hunch, that the procedure would be
> something like this:

We just assume that the current pixman code is correct and run the
test. It naturally fails, because the hardcoded checksum does not match
yet, but reports something like this:
"expected XXXXXXXX, got YYYYYYYY". Then we take this YYYYYYYY checksum
and hardcode it in the sources of the test. The assumption is that the
test will still produce the YYYYYYYY checksum on any platform,
regardless of which optimized fast paths that platform has or doesn't
have.
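
The update procedure can be sketched in a few lines of Python. The
function names check() and checksum_from_report() are hypothetical,
and the exact message format is an assumption taken from the
"something like this" wording, so treat it as an illustration only:

```python
import re


def check(actual, expected):
    """Compare a computed checksum with the hardcoded one."""
    if actual == expected:
        return True, "passed (checksum=%08X)" % actual
    # Assumed message format, modelled on the description above.
    return False, "expected %08X, got %08X" % (expected, actual)


def checksum_from_report(line):
    """Pull the 'got' value out of a failure report, or return None."""
    m = re.search(r"expected ([0-9A-Fa-f]{8}), got ([0-9A-Fa-f]{8})", line)
    return int(m.group(2), 16) if m else None
```

Running the test once with a placeholder as the expected value yields
the report, and the value extracted from it is what gets hardcoded in
the test source afterwards.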

Please note that this is only one type of test in 'make check'.
This approach does not really work well for floating point fast paths
because we can't expect deterministic, pixel-perfect results from
them. Other types of tests exist too.

Anyway, what you are describing below is just the procedure for
narrowing the test failure to a single problematic seed using the
fuzzer-find-diff.pl script:

> - compile pixman without optimizations producing statically linked
>   blitter-test (how?), rename it to blitters-test.generic

Yes, you just use the "--disable-shared" option for pixman configure.
That is also a hint given by the help message of the
fuzzer-find-diff.pl script.

Or even "--enable-static-testprogs --disable-gtk --disable-libpng" to
statically link everything including libc. This may be useful if you
want to run the binary on Android or with qemu-user.

> - compile pixman with optimizations producing statically linked
>   blitter-test (how?), rename it to blitters-test.armv6 (on rpi)

You compile one binary configured with "--disable-arm-simd" and another
one with the arm-simd (armv6) optimizations still in place.

If you run the test on high-end ARM hardware, the "--disable-arm-neon"
configure option is also needed to prevent the NEON fast paths from
getting in the way.

Also, given how slow your Raspberry Pi hardware is, it may even be
beneficial to run the reference binary on your x86 box. The
fuzzer-find-diff.pl script contains an example of using ssh to run
binaries on different machines.

> - run with fuzzer-find-diff.pl for the predetermined number of rounds
> - if no differences found, take the final checksum (from where?) and
>   hardcode it in the fuzzer_test_main() call.

This is not needed. At least not for anything related to 'make check'.

> And normally that would be done only by the maintainers, or when
> someone adds a new fuzzer test. Is that right?

Yes. If the behaviour of pixman changes (for a good reason and in the
same way on all platforms), then the checksums are updated for these
tests.

> For building the generic version, or whatever version is the gold
> standard, should I use all the --disable switches mentioned
> in ./configure --help?

Only the "--disable-arm-simd"/"--disable-arm-neon" options are
important for ARM here. But adding extra --disable switches will not do
any harm either.

> And 'make check' only runs the whatever was built and checks just
> against the hardcoded checksum?

Yes.

-- 
Best regards,
Siarhei Siamashka