[Pixman] [PATCH] MIPS: DSPr2: Fix bug in over_n_8888_8888_ca/over_n_8888_0565_ca routines
nemanja.lukic at rt-rk.com
Mon Mar 4 01:58:42 PST 2013
> Are you referring to MIPS implementation of the following code?
> Looks like a lot of changes for only adding a missing shift. Are you
> really just fixing a single bug and not also introducing something
Yes, it really does look like a huge change for couple of missing shifts.
When I wrote this code in the first place, I misplaced those shifts, which
allowed me to combine code for over operation and:
UN8x4_MUL_UN8x4 (s, ma);
UN8x4_MUL_UN8 (ma, srca);
ma = ~ma;
UN8x4_MUL_UN8x4_ADD_UN8x4 (d, ma, s);
where shifts are not present (for ma). So I decided to rewrite that piece of
from scratch. I changed logic, so now assembly code mimic code from
but process two pixels at a time. This code should be easier to debug and
> Also appears that this is not the only problem in the MIPS DSPr2
> code. Using "test/fuzzer-find-diff.pl" script, I can reproduce one
> more failure:
I'll look into this, and upload separate patch with fix for this.
From: Siarhei Siamashka [mailto:siarhei.siamashka at gmail.com]
Sent: Sunday, March 03, 2013 3:42 AM
To: Nemanja Lukic
Cc: pixman at lists.freedesktop.org; Nemanja Lukic
Subject: Re: [Pixman] [PATCH] MIPS: DSPr2: Fix bug in
On Fri, 1 Mar 2013 10:10:16 +0100
Nemanja Lukic <nlukic at mips.com> wrote:
> From: Nemanja Lukic <nemanja.lukic at rt-rk.com>
> After introducing new PRNG (pseudorandom number generator) a bug in two
> routines was revealed. Bug manifested by wrong calculation in composite
> glyph tests, which caused make check to fail for MIPS DSPr2 optimizations.
Thanks for spotting and addressing this issue. The test suite has a
relatively good coverage, but a chance for missing some bugs always
exists. Increasing the number of iterations in random tests can
reduce this probability, but makes the tests run longer.
One of the reasons for introducing the new PRNG was the intention to
improve performance. It was the biggest performance bottleneck for
the blitters-test program (another bottleneck is CRC32 calculation
there) as can be seen from the profiling logs included in the
Having less than 4 seconds to run blitters-test on a reasonably fast
x86 hardware, we could increase the number of randomly tested
compositing operations quite significantly and improve reliability.
But unfortunately some slow hardware such as MIPS and ARM11 is
holding us back.
The MIPS build of blitters-test needs ~7.5 minutes to run on MIPS 74K
480MHz hardware or ~2.5 minutes in QEMU on Intel Core i7 860 (with a
single thread). That's quite a lot.
CRC32 can be still improved quite significantly by using a better
split-by-four or split-by-eight implementation from zlib or xz (either
by borrowing the code or by adding one of these libraries as an
optional dependency). But that's more like just tens percents of
overall performance improvement and not a magic solution.
> Bug was in the calculation of the:
> *dst = over (src, *dst) when ma == 0xffffffff
Are you referring to MIPS implementation of the following code?
Indeed, in order to test this branch in the blitters-test program, we
need to encounter four consecutive 0xff bytes in the mask. The randomly
generated images are already having more 0xff and 0x00 bytes. But
maybe adding some code to increase the probability of getting large
clusters of 0xff and 0x00 in the randomly generated images could
improve the reliability of testing.
> In this case src was not negated and shifted right by 24 bits,
> it was only negated. Routines are rewritten, and now make check passes
> for DPSr2 optimizations. Performance improvement remained the same as in
> original commit.
> The bug was revealed in commit b31a6962. Errors were detected by composite
> and glyph tests.
> pixman/pixman-mips-dspr2-asm.S | 298
> 1 files changed, 168 insertions(+), 130 deletions(-)
Looks like a lot of changes for only adding a missing shift. Are you
really just fixing a single bug and not also introducing something
Also appears that this is not the only problem in the MIPS DSPr2
code. Using "test/fuzzer-find-diff.pl" script, I can reproduce one
src_fmt=a8r8g8b8, dst_fmt=a8, mask_fmt=null
src_width=1, src_height=1, dst_width=124, dst_height=14
src_x=0, src_y=0, dst_x=4, dst_y=0
The problematic conditions can be reproduced by running:
Which shows us that blitters-test would need to run 2-3x more
iterations to detect this problem (right now it runs 2000000
tests). It is also a good idea to run fuzzer-find-diff.pl
script in the "infinite" mode overnight from time to time, or
at least once before releases.
More information about the Pixman